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ABSTRACT 

We present the first hydrodynamical simulations of structure formation using the new 
moving mesh code AREPO and compare the results with GADGET simulations based on a tra- 
ditional smoothed particle hydrodynamics (SPH) technique. The two codes share the same 
Tree-PM gravity solver and include identical sub-resolution physics for the treatment of star 
formation, but employ different methods to solve the equations of hydrodynamics. This allows 
us to assess the impact of hydro-solver uncertainties on the results of cosmological studies of 
galaxy formation. In this paper, we focus on predictions for global baryon statistics, such as 
the cosmic star formation rate density, after we introduce our simulation suite and numerical 
methods. Properties of individual galaxie s and haloes are examined by ,Keres et al.| ( [2QTT] ), 



while a third paper by |Sijacki et al.lpQTT) uses idealised simulations to analyse in more detail 
the differences between the hydrodynamical schemes. We find that the global baryon statis- 
tics differ significantly between the two simulation approaches. AREPO shows systematically 
higher star formation rates at late times, lower mean temperatures averaged over the simu- 
lation volume, and different gas mass fractions in characteristic phases of the intergalactic 
medium, in particular a reduced amount of hot gas. Although both hydrodynamical codes use 
the same implementation of cooling and yield an identical dark matter halo mass function, 
more gas cools out of haloes in AREPO compared with GADGET towards low redshifts, which 
results in corresponding differences in the late-time star formation rates of galaxies. We show 
that this difference is caused by a higher heating rate with SPH in the outer parts of haloes, 
owing to viscous dissipation of SPH's inherent sonic velocity noise and SPH's efficient damp- 
ing of subsonic turbulence injected in the halo infall region, and because of a higher efficiency 
of gas stripping in AREPO. As a result of such differences, AREPO leads also to more disk- 
like morphologies in the moving mesh calculation compared to GADGET. Our results hence 
indicate that inaccuracies in hydrodynamic solvers can lead to comparatively large systematic 
differences even at the level of predictions for the global state of baryons in the universe. 
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1 INTRODUCTION 

Current theories of galaxy formation are based on the view that the 
dominant mass contribution to the Universe is in the form of cold 
dark matter (DM), which clusters under gravity and builds up the 
backbone of all cosmic structure. Together with the discovery of 
a large dark energy (DE) component (Ries s et al.|1999| |Perlmut-| 
|ter et al. 1999), this led to the emergence of the concordance A 
cold dark matter (ACDM) cosmogony. The third component to the 
energy density in this scenario, the baryons, condense via radia- 



tive cooling at the centres of a population of hierarchically merging 
DM haloes, formin g galaxies (tSilk||1977| |Rees & Ostriker|[T977l 



[White & Rees|1978[|Blumenthal et aL|1984| ). Although th e precise 
physical nature of DM and DE is not yet known (but see [Bertonej 
|eFaL|[2005 for a review of possible DM candidates), large-scale 
predictions of this ACDM theory show good agreement with a wide 
range of observations, among them cosmic microwave background 
(CMB) fluctuations ( [Dunkley et al.|2009|, large-scale clus tering of 
galaxies in the low-redshift universe ( Percival et aL|2010| |, and the 
redshift z ^ 2 — 3 power spectrum probed by the Lyman-a forest 
( |Viel et al.|2009| ). It is however crucial to further constrain and test 
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the ACDM paradigm, especially on the smaller scales of galaxies, 
which remains a theoretical and observational frontier. 

Structure formation is a highly non-linear process and only 
the early stages can be described analytically using linear the- 
ory ( |Zel ' Dovich| 1 970| ) . However, some statistical properties of the 
evolved DM field, in particular the halo mass function, can also 
be predicted analytically by combining linear theory and spherical 
collapse models, through the Press-Schechter formalism ( Press &| 
|Schechter '1974 ) and subsequent extensions (e.g. Bond et al.|1991|). 



These predictions ultimately rely on calibration through more accu- 
rate calculations, which typically can only be done numerically. In- 
deed, while Press-Schechter theory has been quite successful in de- 
scribing the abundances of haloes, once computer models reached 
high precision it became clear that the original analytic treatment 
was not sufficiently accurate for the quantitative work required in 
today's era of precision cosmology (e.g. [Jenkins et al.|2001| ). 

Over the last two decades, numerical simulations have played 
a key role in guiding our knowledge of structure and galaxy for- 
mation. This is especially evident for the CDM component, where 
the relevant calculations have reached a high level of sophistica- 
tion (e.g. Sprin gel et al. 2005 2008 , Di emand et al.|2008l|Boylan-| 
[Kolchin et aH2010f |Klypin et al. 2010). Many important insights 
have been gained from purely collisionless simulations, for exam- 
ple the nearly universal density profiles of DM haloes ( Navarro] 



|eraLl|T996l [T9971 |2010j), assembly bias ( |Gao e t al."200 5]), and 
the internal structure of haloes on small scales (Vogelsberg er et al.| 
[2009 1 1 Vogelsberger & White'2011a). An important reason for to- 
day's reliability of CDM predictions is that the initial conditions 
are unambiguously specified in the ACDM model, with parame- 
ters that are tightly constrained by CMB observations ( [Komatsu] 
|et al.|[20TT| ). Moreover, the computational problem is well-posed 
and comparatively simple, with equations of motion that for DM 
involve only gravity. Efficient new algorithms and the rapid growth 
of computing power over the few last decades have allowed ever 
more detailed theoretical predictions for the DM distribution. It is 
worthwhile, however, to recall that initially such collisionless DM 
simulations suffered from numerical artifacts, leading to issues like 
the infamous "over-merging problem", which resulted in feature- 
less and smooth DM haloes primarily due to insufficient mass-, 
force- and time-resolution ( [Moore et al.|1998|p^ 99 ). 

To understand properties of the observable universe one needs 
to relate the dark matter distribution to that of the baryonic material. 
This can be done in a number of different ways: (i) using so-called 
semi-analytical modelling of galaxy forma tion (e.g. [W hite & Frenk 
1991 , Kauffmann et al.| 199'3l|Baugh|20 06 , Croton et al. 2006[ |Ben-| 



son^ 2010a b , ^Guo et al.||2011|) or the closely rela ted approach of 
halo occupation distributions ( [Benson et al.|2000|), or (ii) usin g di- 



rect hydrodynamical simulations (e.g^Hernquist & Katz"1989 
etal. 1992 , Bryan & Norman 1998 , Dave et al. 1999 , Springe 



]Katz 
et al. 



2005|[Governato et al. 2010). Approach (i) allows a fast exploration 



of a large parameter space of the underlying coarse description of 
galaxy formation physics, whereas (ii) makes possible the calcula- 
tion of a self-consistent model that requires far fewer assumptions 
about the gas dynamics. Nevertheless, both methods share the com- 
mon problem that complicated physics on very small scales, like 
star formation and associated feedback processes, need to be ac- 
counted for in a rather crude way. This propagates into uncertain- 
ties in the interpretation of simulation results for the hydrodynamic 
sector. One possible strategy to get better control over this prob- 
lem is to calibrate sub-resolution treatments in semi-analytic mod- 
els and hydrodynamical simulations observationally (e.g. jSchaye[ 
[et al.|2010|[Guo et al.|2011| ). In addition, numerical artifacts in hy- 



drodynamical simulation techniques need to be better understood, 
hopefully allowing one to reduce or completely eliminate them. 

State-of-the-art cosmological hydrodynamical simulations in- 
clude a prescription for star formation (e.g. 'Springel & HernquistJ 
[2003al ), radiative cooling (fKatz et al. 111996, S mith et al. 2008 ), 
chemical enrichment (e.g. Wiersma et al. 2009| and various feed- 
back processes relevant for low- and high-mass systems (e.g. [Si-[ 
yacki et al.[2007|[S"cannapieco et al.[2008] ). Furthermore, some sim- 
ulation codes can also include magne tic fields ([C ollins et al. 2010! 
[Dolag et al. 1200 5), ra diative transfer ( [Cantalupo & Porciani 2011^ 
Petkova & Spri ngel [[201 01), cosmic ray physics ([Jubelgas et al.[ 
|2008| ), thermal conduction f Jubelgas et al. '2004', "Dolag et al.[2004l)7 
or black hole physics ( Di Matteo et al. 2005 , Springel et al.|2005[ 
[Di Matteoeral.|2 008 ). 

Hydrodynamical simulations have been successfully used to 
study the Lyman-a-forest (e.g. [Hernquist et al.[[7996{ [Katz et ah] 
[1996|[Viel et al.|2005|) and the detailed physics of the intracluster 
medium, where cluster scaling relations were obtained that agreed 
with X-ray observations ( [Puchwein et al.[[2008| ). There have also 
been numerous studies that focused on the formation of a represen- 
tative galaxy populatio n (e.g.[Pearce et aH1 999 Mu rali et al.[2002[ 
[Weinberg et~ari|2004l [Nagamine et ari[2005nCrain et al.[[2009l ). 
While some successes have been achieved here as well, a general 
finding of these simulations is that it appears to be difficult to re- 
produce the observed shallow slope of the faint-end of the galaxy 
luminosity function and the observed morphological mix of galax- 
ies, with its high abundance of disk-like galaxies. Although there 
has been some considerable progress over the last few years on 
the latter issue (e.g. [Robertson et al.|2004|[Governato et al.|2007 



Scannapiec o et al.|2008|[Agertz et ah|2011[[Governato et al.|20To 
Guedes et al .|2011| ), one of the outstanding problems is to produce 
a statistical sample of galaxies that agrees reasonably well with ob- 
servations and reproduces the Hubble sequence. But modifications 
in the detailed feedback and star formation prescriptions proved 
to be important and successful in forming more realistic late type 
spiral galaxies. We note that cosmological hydrodynamic simula- 
tions have also made it clear that a proper inclusion of the baryonic 
physics can lead to interesting back reactions on the DM distribu- 
tion ( [Duffy et al.[2010[[D'Onghia et al.[2010|), potentially eliminat - 
ing or reducing the central dark matter cusp ( jGovernato et al.|2010| l, 
or forming a dark disk ( [Read et al.|2008 |). It is hence clear that it is 
ultimately not sufficient to work with dark-matter only simulations. 

Most hydrodynamical simulations of galaxy formation car- 
ried out thus far have used the smoothed particle hydrodynamics 
(SPH) techniqu e Cucy|[T977l [Gingold & Monagha n[p^ [Mon] 
[aghan[1992 2005)), where the gas is discretised into a set of par- 
ticles for which appropriate equations of motion can be derived. 
The SPH method is well- suited for cosmological applications ow- 
ing to its pseudo-Lagrangian character, which automatically brings 
resolution elements to regions where they are needed most, i.e. col- 
lapsing objects like clusters and galaxies. Also, the conservation 
properties of SPH in terms of simultaneous conservation of energy, 
momentum, mass, entropy and angular momentum are excellent. 
A further convenient feature is that a particle-based gravity solver 
(needed for the DM component anyway) can be readily applied to 
SPH. These properties have made the method also popular for ap- 
plications other than cosmic structure growth, for example for iso- 
lated merger simulations fBarnes & Hernquist 199 2[ [1991 [[1996[ 
Mihos & Hernqui st 1994 1996, Naab & Burkert 2003 1 [Cox et'al] 



2006r 



However, different methods for solving the equations of hy- 
drodynamics in a cosmological context have been employed as 
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well, most of which are descendants of classic Eulerian methods 
( [Stone & Norman||l992i ) on regular meshes. The fixed Cartesian 
grids originally available in the corresponding codes are insuffi- 
cient to capture the large dynamic ranges encountered in galaxy for- 
mation, but finite volume schemes incorporated in modern adaptive 
mesh refinement (AMR) codes (Berge r & Colella|p9 89 , Teyssier 
|2002[|O'Shea et al.|20Q4] l can alleviate this problem. Interestingly, 
these mesh-based methods have led in some cases to significantly 
different results compared with SPH. A prominent example of these 
discrepancies is highlighted in the Santa Barbara cluster compari- 
son project ( [Frenk et al.|1999| l, where it was found that the entropy 
profiles of clusters are significantly and systematically different be- 
tween AMR and SPH codes, the former yielding a large entropy 
core which is absent in the SPH calculations. Only recently has 
some progress been made in identifying the cause for this differ- 
ence ( [Mitchell et al.|200'9a| , but the issue is still not fully under- 
stood; hence we return to it in one of our companion papers fSijacki[ 
[etal.|2011| ). 

It thus appears that there are at least two important challenges 
in the field of cosmological hydrodynamics simulations. One lies in 
an adequate treatment of all relevant physics of galaxy formation, 
in particular with respect to the reliability of the parameterisations 
of sub-resolution physics, and the other having to do with the accu- 
racy and efficiency of the underlying hydrodynamics solver. It is of 
course crucial to strive for a more faithful treatment of the physics 
in the simulations, but it should also be evident that a proper mod- 
elling of the physics relies on a correct solution of the basic hydro- 
dynamical equations in the first place. It is therefore problematic 
to tune the sub-resolution physics without first verifying in detail 
the quality of the hydro solver, because this risks incorrectly ab- 
sorbing deficiencies of a numerical technique into distorted physics 
models. Here, in this initial study we shall primarily be concerned 
with an assessment of the extent to which uncertainties in numeri- 
cal methodology are reflected in the predicted galaxy properties, for 
a fixed physics model. In future works, we will explore the conse- 
quences of various treatments of feedback effects, which are known 
to be crucial for the galaxy formation problem ( [Scannapieco et al.j 
|20Tl) . 

We remark that it is not obvious whether SPH or AMR is 
more accurate in all simulation regimes, as both methods are known 
to have shortcomings. For example, SPH suffers from relatively 
poor shock resolution, noise on the scale of the smoothing ker- 
nel, and low-order accuracy for the treatment of contact discon- 
tinuities. Furthermore, some hydrodynamic instabilities like the 
Kelvin-Helmholtz instability can be suppressed in SPH ( [Agertz[ 
[et aL|200 7 ). Recently, Bauer & Springel (2011) also showed that 
conventional implementations of SPH do not properly properly 
subsonic turbulence, which is generically present in cosmological 
haloes. Astrophysical Eulerian methods on the other hand (usually 
realised as AMR finite-volume schemes) may suffer from over- 
mixing due to advection errors in the presence of bulk flows. One of 
the principal differences between SPH and AMR lies in their han- 
dling of mixing at the level of individual fluid elements. This is ab- 
sent in SPH by construction (unless added in crudely by hand some- 
how), while in AMR it occurs implicitly through the averaging of 
the evolved Riemann solutions over the scale of the grid-cells at the 
end of each timestep. It is not always clear which scheme yields a 
more accurate result ( [Trac et al. [2007 [[Mitchell et al.|20()9b )). 

Another issue with classical AMR codes is that their handling 
of the discrete equations of motion can lead to errors with the fluid 
is moving rapidly across the mesh ( [Springel[2010a[ hereafter SIO). 
Because the truncation error of Eulerian codes depends on the fluid 



velocity relative to the grid, the results can thus be sensitive to the 
presence of bulk velocities (see also [Tasker et al.[[2008^ Wadsley[ 
[et al.|2068] ). Finally, a further challenge for AMR schemes in cos- 
mic structure formation arises from the need to accurately follow 
the gravitational growth of even very small structures. The mesh- 
based Poisson solvers typically employed by AMR codes have been 
shown to lack sufficient small-scale force accuracy and to produce 
too few low mass haloes ( [O'Shea et al.[2005[[He'itmann et al.[2008[ ) 
potentially corrupting the solution on even well-resolved scales. 
While this can be addressed with more aggressive refinement strate- 
gies, the discontinuous changes in gravity resolution brought about 
by such refinements introduce non-Hamiltonian perturbations into 
the dark matter dynamics which are in principle undesirable. 

Recently, SIO introduced a new moving-mesh approach as 
embodied in the AREPO code. The principal goal is similar to the 
earlier implementations of [Gnedin] ( [1995[ ) and [Pen[fl998[ ), but these 
moving-mesh algorithms suffered from grid distortions which lim- 
ited their applicability. AREPO does not use coordinate transforma- 
tions like previous moving mesh codes in cosmology, but instead 
employs an unstructured Voronoi tessellation of the computational 
domain. The mesh-generating points of this tessellation are allowed 
to move freely, offering significant flexibility for representing the 
geometry of the flow. As discussed in detail in SIO, this technique 
avoids several of the weaknesses of SPH and AMR schemes while 
it retains their most important advantages. For example, if the mesh 
motion is tied to the gas flow, the results are Galilean-invariant (like 
in SPH), while at the same time a high accuracy for shocks and con- 
tact discontinuities is achieved (like in Eulerian schemes). 

The aim of this paper is to compare the novel cosmological hy- 
drodynamics code AREPO with the widely used SPH code GADGET 
( [Springel|2005l ). In this first paper of a series we give an overview 
of the numerical techniques used for our galaxy formation simula- 
tions, introduce our simulation set, present an analysis of the global 
baryon characteristics, and evaluate the performance and efficiency 
of the new simulation scheme. In one companion paper ( [Keres et aT] 
[2011[ hereafter Paper II), we discuss the properties and statistics of 
individual galaxies and haloes in the simulations. A further com- 
panion paper ( Sij acki et al.|2011^ hereafter Paper III) analyses ide- 
alised test problems to highlight various differences of the two sim- 
ulation schemes and to elucidate how they affect galaxy formation. 

The outline of this paper is as follows. In Section 2, we provide 
a discussion of the initial conditions and numerical details which 
are particularly important for the goal of our comparison project. 
There we also discuss some modifications of the AREPO code that 
were adopted during the course of this work. Section 3 presents first 
results and discusses the large-scale structure and global statistics 
of the baryon content in the simulations at different resolutions. 
We turn to an analysis of the origin of the cooling difference we 
find between the codes in Section 4. In Section 5 we discuss some 
generic issues and problems of SPH. Finally, we give a summary 
of our results and our conclusions in Section 6. In an Appendix, we 
provide data on the mesh geometry, analyse the performance of the 
codes in different parts of the calculations and present scaling tests. 



2 NUMERICAL METHODS AND SIMULATION SET 
2.1 Implemented physics 

Our GADGET and AREPO simulations follow the same physics, 
consisting of a collisionless dark matter fluid and an ideal baryonic 
gas. Both components act as sources of gravity and are evolved 
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in a Newtonian approximation on top of an expanding Friedman- 
Lemaitre background model. We describe the gas dynamics with 
the ordinary inviscid Euler equations, augmented with additional 
terms that account for "extra physics" related to radiative processes 
and star formation. In particular, we account for optically thin ra- 
diative cooling of a primordial mixture of helium and hydrogen, 
with a hydrogen mass fraction of 0.76. We also assume a spatially 
uniform, time-dependent ionising UV background with an ampli- 
tude and time evolution according to Faucher-Gi guere et al.| ( [2009] ). 

As a result of cooling, gas can collapse to high density and 
turn into stars. Both simulation codes describe this process with 
the simple star formation and supernova feedback model intro- 
duced in |Springel & Hernquist| ( |2003a| ), which has been used in 
many SPH simulations in recent years. In this approach, every suf- 
ficiently dense gas particle/cell is treated as a representative re- 
gion of the interstellar medium (ISM) and is assumed to exhibit 
a multiphase structure consisting of cold and hot gas in pressure 
equilibrium. Stars form out of the cold gas and return energy back 
into the medium though supernova explosions. This energy feed- 
back pressurises the multiphase medium, the effect of which is de- 
scribed in terms of an effective equation of state that is imposed 
onto the gas once it has overcome the density threshold for star for- 
mation. The value of the density threshold we use in this work is 
riH = 0.13 cm~^, set such that isolated disk galaxies at z = 
are characterised with a relation between disk surface gas density 
and star formation density that agrees with the observed Kennicutt 
law ( [Kennicutt||1998| ). To reach these densities gas needs to fall 
into dark matter haloes and dissipate energy via radiative cooling 
to eventually form galaxies ( White & Rees 1978, W hite & Frenk| 
|1991j ). Once the gas density is higher than the threshold value, par- 
ticles/cells are eligible to form stars and can be converted into col- 
lisionless stellar particles. We treat this conversion as a stochastic 
process that generates one generation of stellar particles per hydro- 
dynamic resolution element (SPH particle or Voronoi cell, respec- 
tively) at an average rate equal to the star formation rate predicted 
by the subresolution multi-phase model. 

We stress again that for the purposes of our study it is crucial 
to have an identical implementation of this extra physics in both 
codes in order to allow a straightforward comparison of the two 
hydrodynamical schemes. Fortunately, the quasi-Lagrangian nature 
of AREPO typically allows an easy adoption of methods originally 
developed for particle-based schemes like SPH. We could hence 
adopt most of the "extra physics" implemented in GADGET in a 
largely unmodified form in AREPO, helping to ensure that possible 
implementation differences for this physics do not affect our con- 
clusions. We also emphasise that in our present context it is prefer- 
able to employ a relatively simple subresolution model like that of 
[Springel & Hemqulstl l |2003a) ) in order to facilitate a clear assess- 
ment of the differences between hydro solvers. In particular, for 
now we have not included the effects of strong feedback capable of 
inducing galactic winds and outflows. We are thus not attempting 
to solve the galaxy formation problem at this stage, but are instead 
concerned with identifying and understanding issues related to the 
treatment of the hydrodynamics. 

2.2 Simulation Codes 

In the following two subsections we highlight the most relevant fea- 
tures and parameter settings of the codes we used, as well as any 
particular changes we made for this project. We note that full details 
are presented in substantial depth in the corresponding code papers, 
so in the interest of brevity we here restrict ourselves only to those 



aspects directly pertinent to our study. We are also aided by the fact 
that the AREPO and GADGET codes share the same gravity solver, 
something that was not the case in previous comparison studies of 
SPH and AMR (e.g. 'Regan et al. 2007). Consequently, the evolu- 
tion of the collisionless DM component is identical in both codes 
in the limit of vanishing hydrodynamical forces. This similar han- 
dling of self-gravity is important for isolating differences caused 
primarily by the hydro solvers. 

2.2.1 GADGET 

GADGET is a widely used and well-tested SPH-code which em- 
ploys a formulation of SPH that simultaneously conserves energy 
and entropy despite the use of fully adaptive smoothing lengths 
( [Springel & Hernquist|200 2). The gravity calculation is split into 
short- and long-range components, where short-range forces are 
calculated with an hierarchical oct-tree algorithm (^ Barnes & Hut| 
1986 , Hernquist|1987 ) and the Ion g-range forces are evaluated with 
a particle mesh (PM) method (e.g. Hock ney°& Eastwood|| 19811 ). 
Our GADGET runs are based on a default setting of the numerical 
SPH parameters, using 32 neighbours in smoothed estimates and 
an artificial viscosity parameter of a = 1.0, combined with Bal- 
sara's switch ( |Balsara|1995j to reduce the viscosity in the presence 
of strong shear. 

2.2.2 AREPO 

AREPO is a second-order accurate finite volume method that solves 
the Euler equations using piece-wise linear reconstruction and a 
calculation of hydrodynamical fluxes at each cell interface with an 
exact Riemann solver. The basic solution strategy of the code is that 
of the well-known MUSCL-Hancock scheme. What makes AREPO 
unusual is that it employs an unstructured mesh based on a Voronoi 
tessellation using a set of mesh-generating points. Furthermore, the 
mesh is allowed to move freely as a function of time. In our runs, we 
tie the motion of the mesh-generating points to the hydrodynami- 
cal flow, which gives the scheme a quasi-Lagrangian character and 
makes the mesh automatically adaptive. As SIO demonstrated, such 
an approach offers a number of advantages compared with ordinary 
Eulerian or Lagrangian codes, including a reduction of numerical 
diffusivity and advection errors and an improved handling of con- 
tact discontinuities. In the following, we discuss some details and 
modifications of AREPO relevant for our study. 

(i) Mesh regular isation: It is desirable to have a Voronoi 
mesh geometry where the geometric centre of each Voronoi cell 
lies reasonably close to the mesh-generating point of that cell. This 
reduces the size of errors in the linear reconstruction step of the 
MUSCL-Hancock scheme and also limits the rate at which mesh 
faces turn their orientation during the mesh motion. AREPO there- 
fore incorporates a method to regularise the mesh, as described 
in SIO. This scheme uses a modified Lloyd algorithm to drift the 
mesh a bit towards its centroidal configuration at each time step. 
To this end the motion of the mesh-generating points contains an 
additional velocity component for certain cells, designed to move 
a mesh-generating point closer towards the geometric centre of its 
cell. The default setup of AREPO uses the spatial offset between the 
actual location of a vertex and its cell's geometric centre in units 
of the cell's size to decide whether or not this corrective motion 
should be added to the cell's vertex velocity. 

However, a number of test simulations revealed that this 
regularisation scheme can introduce some unwanted side effects. 
Specifically, systematic differences in the average mass of gaseous 
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cells can develop between the star-forming gas and the lower den- 
sity material in the surrounding halo, with the former tending to 
become systematically more massive than the latter. That such a 
tendency exists in principle is to be expected because the Lloyd 
algorithm would ultimately converge to a homogeneous Voronoi 
mesh with cells of equal volume if it applied to every cell centre 
with full intensity all the time. The mesh regularisation motions in 
the default scheme have hence the tendency to move the mesh away 
from the densest regions (but leaving the gas there - in AREPO, the 
gas motion and the mesh motion are independent), such that the 
mass per cell will tend to increase in the densest regions. This is 
somewhat undesirable because ideally we want to have the high- 
est mass resolution in the central parts of galaxies and not in the 
surrounding low density gas. 

There are at least two solutions to this problem (besides dis- 
abling the regularisation scheme altogether). One is to simply make 
the settings of the standard regularisation less aggressive, thereby 
reducing the systematic effects described above. Another possibil- 
ity is to change the criterion that decides when such a mesh correc- 
tion motion should be applied. We have found that the simple crite- 
rion originally implemented in AREPO, based on the displacement 
of a mesh-generating point from the geometric centre of its cell, is 
prone to invoke mesh regularisation drifts even when they are not 
needed. This happens especially in regions with strong density gra- 
dients, where such displacements naturally occur even if the mesh 
geometry is acceptable. As an alternative criterion we have there- 
fore considered the maximum opening angle under which a cell 
face is seen from the mesh-generating point. This identifies prob- 
lematic cell geometries more directly, which are in fact those where 
a point lies close to a wall, or equivalently, where this opening angle 
is large. We find that this opening angle criterion is more tolerant to 
rapid changes in spatial resolution, and is hence less prone to trig- 
gering unwanted mesh-correction motions in the presence of large 
density gradients. Because of this it virtually eliminates the mass 
segregation effect mentioned above. 

For definiteness, in the simulations presented here, we in- 
voke mesh-regularisation motions if the maximum face angle of 
a Voronoi cell (defined as the maximum of ^J Ai/i^ /hi, where Ai 
is the area of a face and hi its distance to the vertex) is larger than 
1.68. In this case we add a velocity component up to half the sound 
speed of the corresponding cell to the vertex velocity, which moves 
the mesh-generating point closer towards the geometric centre of 
the Voronoi cell. Ideally, this scheme should then guarantee reason- 
ably regular cells where the bulk of all cells will have a maximum 
face angle less than 1.68. Our tests confirm that this is the case, 
while at the same time the mass bias between star forming high 
density cells and the lower density cells in the surrounding halo 
is significantly reduced. The face angle scheme applies the mesh- 
correction motions more rarely than the original approach but still 
frequently enough to maintain a regular mesh. In the Appendix, we 
discuss some measurements of the mesh statistics, which also show 
that this new regularisation scheme still ensures that the offsets be- 
tween mesh-generating points and the geometric centres of the cor- 
responding cells remain small, as is desired to minimise errors in 
the linear reconstruction step. 

(ii) Refinement and de-refinement: AREPO is quasi- 
Lagrangian in the sense that the vertices of the Voronoi mesh fol- 
low the flow of the fluid such that an approximately constant mass 
resolution is automatically achieved. The code is not purely La- 
grangian, however, because mass can be exchanged between cells 
in a manner consistent with the equations of motion, in order to 
prevent the mesh from becoming highly distorted. Moreover, the 



code can also make use of re- and derefinement of individual cells 
in order to improve the local resolution beyond that offered by the 
dynamical motion of the mesh or to improve efficiency when high 
resolution is no longer required in certain regions. The method em- 
ployed by AREPO is more general than that used by standard AMR 
methods, which typically employ subgrids to refine a certain sec- 
tion of the parent grid, thereby creating a hierarchy of overlapping 
grid patches. In AREPO, we are not required to use subgrids, but 
can split or merge individual cells, such that a single global mesh 
with a smooth transition from low- to high-resolution parts results. 
The criteria for when a change of the local resolution should be in- 
troduced are very flexible, just as in the AMR technique. Indeed, 
because the initial distribution of mesh generating points and the 
manner in which they are updated in time are arbitrary, AREPO of- 
fers the capability of running in an AMR mode. 

In our cosmological galaxy formation simulations it is desir- 
able to maintain a roughly constant mass resolution. This is be- 
cause in runs with star formation whole gas cells can turn into col- 
lisionless stellar particles, and it is best to create them with ap- 
proximately the same mass to avoid potential two-body heating ef- 
fects among these star particles. A relatively homogeneous mass 
resolution in the gas phase is also warranted to guarantee a more 
proper comparison to the SPH runs, where a fixed mass resolution 
is present by construction. (We note, however, that this fixed mass 
resolution in SPH is a consequence of this method's inability to 
correctly represent the flow on small scales, as we discuss in Sec- 
tion 5.3.) 

We have therefore introduced a new re-/derefinement scheme 
in AREPO that ensures that the mass resolution in the gas and the 
spawned star particles never deviates too much from the value 
of the SPH particle mass in the corresponding GADGET simula- 
tion. Specifically, we "force" cells to have a mass in the range 
1/2 < mceii/'/^target % 2, whcrc wc sct the target gas mass 
^target cqual to thc mean cell mass assuming a homogeneous gas 
distribution in the simulation volume with equal volume cells. This 
corresponds then exactly to the mass of an individual SPH particle 
in the GADGET calculation at the same particle/cell number. We al- 
low only quite "roundish" cells to be refined, specifically we only 
refine a cell if the maximum face angle is smaller than 3.38. This 
protects against introducing large local errors by splitting very ir- 
regular cells (which are usually cells that have just been split in the 
previous timestep). As a result of this constraint our simulations 
may contain at any given time a few cells which are more massive 
than 2 x mtarget- We have also done runs where only cells with a 
sufficiently shallow temperature gradient across them were allowed 
to be derefined, but this did not make any significant difference to 
our results. 

We also slightly modified the star formation implementation 
to take this re-/derefinement scheme into account. Specifically, if a 
star particle is formed in a cell with mass less than 2 x mtarget, 
it inherits the full mass and the cell is removed. Otherwise, a star 
particle is created with a mass equal to mtarget, leaving the rest of 
the mass in the (surviving) cell. Cells that do not contain at least 
1 /4 X mtarget do not produce stellar particles, which prevents very 
low mass collisionless particles from being formed. We note that 
such low mass cells are exceedingly rare, because cells with less 
than 1/2 X mtarget are normally all removed during the derefine- 
ment process. However, since two neighbouring cells will never be 
dissolved in the same timestep (see the discussion in SIO), a cell 
can in rare cases briefly scatter below 1/2 x mtarget and poten- 
tially also attempt to spawn a stellar particle during this time. 

This modification of the star formation implementation guar- 
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antees that star particle masses are bounded by the interval 1/2 < 
m^/mtavget < 2. Our GADGET and AREPO simulations both use 
only one generation of stellar particles, which implies that the total 
number of baryonic resolution elements in the GADGET simula- 
tions is strictly conserved, while in AREPO it can fluctuate slightly 
around the initial value. 

(iii) Dual entropy formalism: As discussed in SIO, finite vol- 
ume hydro solvers can suffer from spurious heating in highly super- 
sonic, cold parts of the flow, where the kinetic energy dominates 
over the thermal energy. Especially at high redshifts in cosmologi- 
cal simulations, this can lead to an incorrect temperature evolution 
in low-density gas as a result of discretisation errors in the advec- 
tion of the kinetic energy. Owing to its quasi-Lagrangian charac- 
ter AREPO is somewhat less prone to this effect than fixed mesh 
techniques, but it can still be present at some level. To circum- 
vent this problem, AREPO can optionally force the entropy to be 
conserved instead of the energy during a timestep. In this case, no 
spurious heating will be introduced. However, entropy is conserved 
only outside of shocks in adiabatic parts of the flow, so a criterion is 
required to decide on a cell-by-cell basis whether energy or entropy 
conservation should be given precedence for a cell's evolution over 
the current timestep. 

As SIO showed, this dual entropy formalism is especially use- 
ful for non-radiative simulations, because here a spurious heating 
cannot be compensated by radiative cooling. Since our simulations 
include radiative cooling due to a primordial gas mixture we ex- 
pect this to be less problematic for our applications. Indeed, ex- 
tensive test simulations showed that our results are not sensitive to 
whether or not the dual entropy formalism is used. We have there- 
fore decided to exclusively use strict energy conservation in all our 
production calculations, without employing the dual entropy for- 
malism. 

(iv) Gravitational softening length: GADGET uses a global 
gravitational softening length for collisional and collisionless par- 
ticles, i.e. for gas and stellar/dark matter particles. AREPO also 
uses a global gravitational softening length for collisionless par- 
ticles (which we set equal to that used in our GADGET simula- 
tions). However, as the code is a finite volume method which uses 
gas cells of variable size to represent the fluid, we have used indi- 
vidual Plummer-equivalent gravitational softening lengths accord- 
ing to eceii = Cfac X Tceii for cach gaseous cell, where rceii = 
(3yceii/47r)^/^ is the cell's fiducial radius assuming the cell vol- 
ume is spread over a sphere, and Cfac = 2.5 is an overall scaling 
factor. In addition, we imposed a lower floor on the gravitational 
softening lengths equal to that of the SPH particles in the corre- 
sponding GADGET simulation. This was done in order to safeguard 
against a potentially higher gravitational resolution in AREPO in 
the densest gas, which may have distorted our comparison. How- 
ever we note that we do not expect any significant difference due to 
the slightly different treatment of the gravitational softening length 
of the gas even if such a floor was not used. Indeed, as described 
in Appendix B, we have repeated lower resolution AREPO simu- 
lations with a fixed softening length for all gas cells and obtained 
nearly identical results. Also, we have repeated simulations with 
adaptive gravitational softening without a lower floor and again ob- 
tained equivalent results. We note that the presence of an effective 
equation of state for highly dense gas is ultimately responsible for 
this insensitivity. Without it, we would get fragmentation of very 
dense gas where the adaptive softening could make a difference 
( |Bate&Burkert|1997| ). 



2.3 Code comparison strategy 

In any comparison between codes as different as GADGET and 
AREPO, a number of different benchmarks can be used to gauge 
their relative performance, and various criteria can be defined to 
align runs that are deemed comparable with each other. In our 
study, we have chosen to compare GADGET and AREPO at match- 
ing mass resolution, using the same initial number of baryonic and 
dark matter resolution elements. This provides a well-defined com- 
parison strategy and is advantageous for a number of reasons: 

(1) First, we can use the same initial conditions for the sim- 
ulations with both codes. This would not be possible if we per- 
formed the runs with different initial mass resolutions. (2) Second, 
our choice makes it straightforward to relate the same objects be- 
tween the different runs (and they are resolved in dark matter in the 
same way). (3) Third, as we discuss below, for the same initial mass 
resolution, the CPU time requirements for GADGET and AREPO 
are similar, at least for cosmological simulations, with the moving 
mesh runs being only some ^ 30% more costly. In many cases, 
the limiting factor for the simulation size that can be achieved is 
the amount of computing time required, so our choice effectively 
enables a comparison for fixed computational resources. 

Of course, in principle one may also attempt a comparison 
at equivalent spatial resolution in the gas between GADGET and 
AREPO, or at the same accuracy in the solution obtained. We have 
not attempted this for the following reasons. First, while the spatial 
resolution that can be achieved for a given number of resolution 
elements is expected to be worse in SPH compared to AREPO, the 
degree to which this matters is likely very problem-dependent. For 
example, for our treatment of star formation, the mass resolution is 
much more important than a precise treatment of fluid discontinu- 
ities. Aligning the spatial resolutions such that discontinuities are 
broadened over the same scale in both codes would lead to dras- 
tically different mass resolutions in the star-forming phases and 
hence likely to large systematic resolution effects in the smallest 
galaxies. 

Second, and even more important in our view, is that the con- 
vergence properties of SPH are not well understood, making a com- 
parison at fixed accuracy a highly problematic undertaking from 
the start. As we discuss at length in Section 5, formal convergence 
in SPH ultimately requires that the number of neighbouring par- 
ticles used in smoothed estimates around a particular location be 
increased as the total number of particles is made larger. This has 
typically not been done in cosmological applications, and is also 
in practice not readily possible due to clumping instabilities (e.g. 
ISchuessler & Sc hmitt 1981). 

Moreover, while SPH is often described as being "La- 
grangian," this is not formally correct because individual fluid el- 
ements, as represented by SPH particles, are not allowed to de- 
form arbitrarily in response to e.g. shearing motions, as we illus- 
trate in Section 5. SPH should, therefore, properly be referred to 
as "pseudo-Lagrangian." The errors that this feature of SPH entails 
depend on the detailed properties of the flow, in addition to the local 
spatial resolution, and consequently cannot be assessed in general. 
In its most basic formulation, SPH may simply lack a formal con- 
vergence condition, making it unclear how a comparison at fixed 
accuracy should be defined. 

Finally, in our study, we have chosen a particular implemen- 
tation of "standard SPH", as implemented by the GADGET code. 
Moreover, by choosing GADGET for our SPH simulations, we are 
able to use the same gravity solver and the same physics in our 
comparisons between SPH and AREPO, which might not be possi- 
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Name 


Code 


Boxsize [h~^ Mpc^] 


hydro elements 


DM particles 


^target/SPH[^ ^ M©] 


mDM[^ ^M©] 


e[/i-ikpc] 


A_L20n512 


AREPO 


20^ 


512^ 


5123 


7.444 X 10^ 


3.722 X 10^ 


1 


G_L20n512 


GADGET 


20^ 


5123 


5123 


7.444 X 10^ 


3.722 X 10^ 


1 


A_L20n256 


AREPO 


20^ 


2563 


2563 


5.955 X 10^ 


2.977 X 10'^ 


2 


G_L20n256 


GADGET 


20^ 


2563 


2563 


5.955 X 10^ 


2.977 X 10'^ 


2 


A_L20nl28 


AREPO 


20^ 


1283 


1283 


4.764 X 10'^ 


2.382 X 10^ 


4 


G_L20nl28 


GADGET 


20^ 


1283 


1283 


4.764 X 10'^ 


2.382 X 10^ 


4 



Table 1. Names and basic setup of our different simulations. All calculations were performed in a periodic box with sidelength 20 Mpc. The number of 
baryonic resolution elements (SPH particles/Voronoi cells, and star particles) is fixed in the GADGET runs, but can vary due to re- and derefinement operations 
in the AREPO simulations. We enforce a refinement scheme that keeps cell masses close to a target gas mass mtarget , which we set to the SPH particle mass of 
the GADGET run at the same particle/cell number. The comoving Plummer equivalent gravitational softening length e is constant in GADGET, but adaptive 
(with a floor) for gas cells in AREPO. Additional AREPO simulations with a fixed gravitational softening length yielded essentially equivalent results, as 
described in Appendix B. 



ble otherwise. This enables us to isolate differences in runs that owe 
primarily to the differences in the hydro solvers between GADGET 
and AREPO. As we discuss in Section 5, various modifications have 
been proposed to the basic structure of SPH in order to improve its 
reliability under some circumstances. It is, of course, possible that 
we would have obtained somewhat different results had we em- 
ployed such formulations of SPH. However, we believe that many 
limitations of SPH are generic and are not specific to GADGET. 
Indeed, as we discuss in Section 5, it is difficult to see how SPH 
could be modified to entirely eliminate, for example, the sources 
of error that owe to its pseudo-Lagrangian nature without radical 
modifications to the underlying algorithm. 

We also stress again that the primary goal of our comparison 
is not to arrive at the best fit to observational data, or to achieve the 
highest possible resolution for single galaxy models. While we ac- 
knowledge that important strides have been made recently in study- 
ing disk formation numerically using zoom-in procedures in cos- 
mological SPH simulations (e.g. Agertz et al.|2011[ |Guedes et al.^ 
[2011. ) , we do not think these successes provide much evidence for 
the numerical reliability of SPH, although this misconception ap- 
pears widespread. Our ultimate aim is not merely to model sin- 
gle objects, but entire populations of galaxies so that we can sta- 
tistically constrain uncertain processes associated with star forma- 
tion and feedback by using, e.g., observations of the distribution of 
galaxies with respect to their Hubble type. This requires the iden- 
tification of accurate and computationally efficient techniques that 
are best suitable for the task. In particular, our strategy is designed 
to detect systematic inaccuracies in current simulation techniques 
that may easily be concealed by the adjustment of heuristic feed- 
back parameters to match a certain observation, as it is commonly 
done, while at the same time these effects may have serious conse- 
quences in other places. 



2.4 Simulation Suite 

For our simulation set we adopt the cosmological parameters 
Q.mQ 0.27, r^Ao 0.73, Qbo 0.045, as 0.8, Us 0.95 
andifo = 100/ikms-^Mpc-^ = 70kms-^Mpc-^ Qi = 0.7). 
These parameters are consistent with the most recent WMAP-7 
measurements ( Komats u et al.|20l Tl as well as with a host of other 
cosmological constraints. We create a realisation of this cosmol- 
ogy in a periodic box with a sidelength 20 Mpc, at three dif- 
ferent mass resolutions corresponding to 2 x 128^, 2 x 256^ and 
2 X 512^ particles/cells. A comoving gravitational softening length 



of 4 kpc, 2h~^ kpc or 1 kpc, respectively, is used for the 
dark matter, and for the gas particles in the GADGET runs. For 
AREPO, we start initially with the same number of cells as there 
are particles in the corresponding SPH run, but we use an adap- 
tive gravitational softening with a floor as described earlier. Note 
that due to re- and derefinement the number of baryonic resolution 
elements (cells plus star particles) is not strictly conserved during 
the course of a simulation but can fluctuate slightly around the ini- 
tial number. Initial conditions were generated at 2; = 99 based on 
the power spectrum fit of jEisenstein & Hu| ([1999 ), with gas par- 
ticles/cells added to the initial conditions by splitting each origi- 
nal particle into a dark matter and gas particle/cell pair, displacing 
them with respect to each other such that two interleaved grids are 
formed, keeping the centre-of-mass of each pair fixed. 

All our simulations have been evolved until redshift z — ^ 
even though the fundamental mode of our small box becomes 
mildly non-linear at low redshift. However, our primary goal here is 
not to obtain quantitatively accurate large-scale statistics, but rather 
to compare the properties of galaxies formed by two different hy- 
drodynamical schemes. As we start all simulations with the same 
phases, this relative comparison is unaffected by box-size effects 
down to 2; = 0. To easily refer to the different runs in our simulation 
suite we use the following naming convention. A_L20nX denotes 
AREPO runs (indicated by the leading 'A'), where X = 128, 256, 
or 512 encodes the numerical resolution and the tag 'L20' stands 
for the box size of L = 20 Mpc. Likewise, GX20nX refers 
to our GADGET runs (indicated by the 'G'). In Table [T] we pro- 
vide an overview of our simulation suite and list its most important 
parameters. 



3 RESULTS FOR GLOBAL BARYON STATISTICS 
3.1 Density and temperature maps 

To obtain a first impression of the simulations, it is instructive to 
examine maps of the density and temperature fields. In Fig. [T] we 
show projections of the gas density and temperature distributions 
for our highest resolution simulations AX20n512 and G_L20n512 
at 2; = 2. The left panel gives the AREPO results and the right 
panel those obtained with GADGET. In both cases, the image on 
top depicts a large part of the full box, while the zoom images be- 
low show two successive zooms onto a disk galaxy (as indicated by 
the white squares). In these enlarged images, the left half depicts 
the temperature field while the right half gives the density field. All 
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Figure 1. Projected gas density and temperature maps for the highest resolution AREPO (left) and GADGET (right) simulations (2 x 512^ particles/cells) 
at redshift z = 2. All sHces have a thickness of2h~^ Mpc (comoving). The upper two panels show a large part of the full simulation volume. This region 
extends 20 Mpc in the x-direction and contains one of the largest filaments in the box. The other eight panels form a zoom-in onto one object. Each 
zoomed region is indicated by a white square in the previous step, and the corresponding comoving length scales are indicated by the small white bars in the 
lower left of each image. On the left side of the zoom-images we show the temperature, and on the right the density fields. The colour maps at the different 
zoom levels change, but they are the same for both codes. At the highest zoom-in factor in the lower panels, a disk galaxy is visible. The disk is very extended 
in AREPO, and also features a bar. The SPH calculation produces a significantly smaller gas disk in the same halo, and the stellar bar is smaller compared to 
the moving-mesh simulation. 



slices have a thickness of 2 Mpc and their extent in the image 
plane is specified by the scale-bars included in the images. Note 
that we did not centre the two image sequences individually onto 
the galaxies, so the zoom sequence also illustrates the typical mag- 
nitude of coordinate offsets that can develop between the different 
simulations. 

Inspection of the maps in Fig.[T]demonstrates that both codes 
produce essentially identical results on large scales. This is rea- 
sonable since we do not expect large-scale changes induced by a 
different hydrodynamical scheme. However, already at the inter- 
mediate zoom level one can identify some significant differences. 
The distribution of hot gas is clearly quite different, with GADGET 
showing a more extended hot atmosphere compared to AREPO. On 
still smaller scales, the differences between the two hydrodynam- 
ical codes become even more pronounced: whereas AREPO pro- 
duces an extended disk, GADGET forms a significantly smaller and 
clumpier disk. The difference can be best appreciated in the tem- 
perature map, where the cold gas disk stands out clearly as a large 
blue region in the moving mesh calculation. AREPO forms a spi- 
ral disk with a central bar, which is visible in the density map. For 
a more detailed analysis of the structural properties of this galaxy 



we refer the reader to Paper II of this series. We note that although 
some spiral features are visible in the disk, these are partly seeded 
by Poisson noise in the potential due to the limited number of DM 
particles in the halo (see D'Onghia et al.^2011^ for details of this 
process), and have hence only limited physical significance. 

In order to demonstrate that differences like those seen in 
Fig. [T] do not appear only in unusual cases, we show in Fig. [2] gas 
density projections of the central galaxies of the 24 most massive 
haloes at z = 2 of A_L20n512 (top panels), and G_L20n512 (bot- 
tom panels), along random projection directions. We did not im- 
pose any constraint on the galaxy selection other than halo mass; as 
a result, we can also have galaxies in our sample that are disturbed 
or undergo a merger, such as the system labelled 'galaxy-id 5'. The 
object 'galaxy-id 8' of Fig. [2] corresponds to the galaxy shown in 
Fig.[T] but happens to be oriented edge-on here. The random orien- 
tations of the galaxy set give an impression of how a population of 
galaxies would look like in these simulations. Clearly, the gas dis- 
tributions of most of the galaxies in Fig. |2] are more extended and 
more reminiscent of spiral galaxies in AREPO than in GADGET. 

Inspection of animated sequences from the AREPO simula- 
tions reveals a population of interacting and merging systems. 
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Figure 2. Gas density projections of the central galaxies in the 24 most massive haloes in A_L20n512 at z = 2, together with the corresponding galaxies in 
the G_L20n512 simulation (top four rows: AREPO, bottom four rows: GADGET). The small white bars indicate 5 kpc (physical). We project along random 
directions to show various viewing angles, similar to a real galaxy field of view. The galaxy with number 8 is the object shown in the zoomed-in region of 
Fig. ^ but here now shown more edge-on. Essentially in all of the 24 cases the AREPO gas distributions are more extended than those of the equivalent 
GADGET galaxies. This demonstrates that the galaxy shown in Fig.[T]is not an exception, but rather a typical case in our simulations. 
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Figure 3. Stellar density projections of the central galaxies in the 24 most massive haloes in A_L20n512 at z = 2, together with the corresponding galaxies 
in the G_L20n512 simulation (top four rows: AREPO, bottom four rows: GADGET). The small white bars indicate 2.5 kpc (physical). Density is encoded 
as intensity, whereas the colour scale (blue-red) indicates the stellar age (young-old). Galaxies are shown from the same viewing angle as in Fig.|2] but with 
reduced sidelength. The differences in stellar radii are not as large as those found for the gas distribution. In most cases the stellar distribution is more disky in 
AREPO compared to the galaxies in the SPH runs, and the stellar population in the AREPO simulations is younger making the galaxies on average bluer. 
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Figure 4. Gas density projections of 32 galaxies at z = randomly selected within the halo mass range from-^ 1 X 10^1 h-^ M© to ~ 6 x 10^^ h'^ M©. 
The top four rows show the AREPO galaxies and the bottom four rows the matched GADGET galaxies. The galaxies are shown face-on for better visual 
inspection of the disk structure. The small white bars indicate 10 kpc (physical). The difference in the disk morphology is very striking and follows the same 
trend found at z = 2 in Fig.|2] AREPO gas disks are significantly more extended than those in the SPH simulations. 



Many of them show thin bridges and tails, as expected if the galax- 
ies are rotationally supported disks (e.g. [Toomre & Too mre 1972, 



D'Onghia et al.|20 1of| 



The differences in the gas properties are also reflected in 
the corresponding stellar distributions of these galaxies, which we 



^ Volume-rendering movies of AREPO galaxies (at redshift z = \ and 
z = 2) and high-resolution images are available for download at the website 
http://www.cfa.harvard.edu/itc/research/movingmeshcosmology. 



show in Fig. [3] Although the stellar distributions are more similar 
between the two numerical schemes, in most cases one still no- 
tices that AREPO 's disks are slightly larger and appear typically 
more disky, an impression that is also supported by the quantitative 
analysis of galaxy properties that we present in Paper 11. Further- 
more, the stellar populations of the AREPO galaxies are bluer, i.e. 
younger, than those found in the GADGET run. 

In Fig. |4] and Fig. [5] we show face-on projections of 32 other 
galaxies at z = with Milky Way-like DM halo masses in the 
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Figure 5. Stellar density projections of the 32 galaxies shown in Fig.|4] The small white bars indicate 10 kpc (physical). As in Fig.[3]the density is encoded 
as intensity, whereas the colour scale (blue-red) encodes the stellar age (young-old). The viewing angle is the same as in Fig.|4| Stellar radii differ not as much 
as the gas disk scale length found in Fig. [4] As in Fig.[3]the GADGET galaxies are slightly redder than the galaxies found in the AREPO simulations. 



mass range from - 1 x 10^^ M© to - 6 x 10^^ M©. 
The gas disk structure shown in Fig. |4] follows the trend already 
seen at 2; = 2, and demonstrates that AREPO produces significantly 
larger disks at all times and at all halo masses compared to the GAD- 
GET simulations. Fig. [5] demonstrates that the stellar disks are also 
slightly larger, but not as significant as the gas disks. But similar to 
the z = 2 result, the AREPO stellar disk populations are typically 
younger than those found in GADGET. 

A self-evident conclusion from these visual comparisons of 
Figs. [2] |3] and Figs. |4] [5] is that the morphology of forming galax- 



ies in cosmological hydrodynamics simulations can be strongly af- 
fected by the underlying hydrodynamics solver. SPH has signifi- 
cantly more problems producing disk-like galaxies than the moving 
mesh approach for a similar computational cost. Note however that 
we do not expect our simulations to produce a completely realistic 
2; = 2 galaxy population, because we here refrained from includ- 
ing strong feedback capable of producing galactic winds and out- 
flows. However, since both simulations followed exactly the same 
physics, we expect this qualitative difference to be present in sim- 
ulations with stronger feedback as well. 
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Figure 6. Mass functions of FoF groups at 2; = 2 (upper panels) and z = ^ (lower panels). Left column: dark matter, middle column: gas, right column: stars. 
Vertical lines indicate the 32 dark matter particle mass limit. The dark matter mass functions agree and converge very well, demonstrating that the gravitational 
evolution is indeed treated on an equal footing, whereas gas and stellar mass functions deviate due to the different hydro-solvers. Towards lower redshifts there 
are typically fewer massive gas systems and more massive stellar systems in AREPO compared to GADGET. 



3.2 Mass functions 

As we emphasised earlier, AREPO and GADGET use the same grav- 
ity solver based on a high-resolution Tree-PM scheme. Because 
the gravitational field is largely dominated by the collisionless DM 
component, we expect that the DM distribution in both simulations 
should hence be similar, at least on scales where baryonic effects 
do not change the structure of DM haloes. We explicitly verify this 
expectation in the left column of Fig. [6] where we compare the 
DM halo mass function of friends-of-friends (FoF) groups with a 
linking length of 0.2 of the mean particle separation down to a par- 
ticle number limit of 32 particles at z = and 2; = 2. The codes 
show excellent agreement in the DM halo mass functions at both 
redshifts. Also, the convergence for the different resolutions is in 
both cases very good. We note that previous comparisons of hy- 
drodynamical mesh codes and high resolution tree codes ( |0'Shea| 
|et al. 12005) [Heitmann et al.|2008| ) have shown a deficit of low-mass 
haloes in the mesh-based hydrodynamical codes, an effect that has 
its origin in their AMR-based gravity solver, which typically does 
not refine small DM haloes sufficiently early. As evidenced here, 
AREPO does not have this problem. 

The other panels of Fig.[6]count haloes in terms of their bary- 
onic content, separately for gas and stars. In order to relate baryons 
to dark matter haloes, we determine for each stellar and gaseous 
cell/particle the closest dark matter particle, and then associate the 
baryonic mass element with the FoF group this DM particle resides 



in (see |Dolag et al.|2009l for more details). The middle column of 
Fig. [6] shows the cumulative halo gas mass functions at redshifts 
z — 2 and z = 0. While both codes converge equally well for the 
gas mass functions as a function of resolution, they do not agree 
on the result. At z = 2, AREPO has more objects of a given gas 
mass than GADGET over nearly the full range of halo masses. This 
behaviour changes towards lower redshifts, where fewer very gas- 
rich objects are found in AREPO. We will show below that this has 
to do with more efficient cooling and a higher star formation rate in 
the mesh code in massive haloes at late times. 

The right column in Fig. [6] gives the mass functions for the 
stellar component of haloes, and here the situation is slightly dif- 
ferent. At 2; = 2, we find fewer objects with low stellar mass in 
AREPO compared to GADGET, but the situation reverses at the high 
mass end where we find AREPO has more haloes at a given stellar 
mass. This behaviour changes somewhat towards z = 0, where the 
low mass end agrees now very well between the two codes. How- 
ever, there are still more stellar systems of high mass in AREPO, 
and this difference becomes more pronounced and extends over a 
larger range of masses. Again, this can be attributed to the more 
efficient star formation at late times in massive systems in AREPO 
that we analyse in detail below. Since the baryonic mass contributes 
only a small amount to the total mass of the entire box, these dif- 
ferences in the baryonic mass functions are neither imprinted in 
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Figure 7. Gas density-temperature phase diagrams of AREPO and GADGET at the highest resolution (top: z = 2; bottom: z = 0; left: AREPO; right: 
GADGET). Each bin of the maps contains the total gas mass at the corresponding density and temperature. The different characteristic phases are populated 
similarly in the different runs, showing that the overall global gas properties agree. But there are also some visible differences between the SPH and moving 
mesh calculations. AREPO typically produces more high temperature gas at large densities, but the overall amount of hot gas is reduced, which can be seen 
by the more extended yellow region in GADGET at high temperatures. The dashed lines and numbers indicate the gas cuts used for quantifying the mass and 
volume fractions in Fig.[8]and Fig.|9] respectively. 



the DM halo mass functions nor visible in the large-scale structure 
shown in Fig.^ 

3.3 Gas phases 

Because we use two completely different hydro solvers, we may 
expect some deviations in the breakdown of various gas phases. A 
first simple way to obtain an overview of the global state of the gas 
in the simulation volume is the construction of density-temperature 
phase-space diagrams. We show in Fig. [7] mass-weighted p-T 
phase-space diagrams of the simulations at the highest resolution 



for z = 2 (top row) and z = (bottom row). The left and right 
columns show results for AREPO and GADGET, respectively. We 
can readily identify a number of well-known features in the gas 
phase- space ( [Dave et al.|2001| ), which show up in both runs. The 
narrow ridge of gas with an upward slope and p/phav < 10 and 
T < 10^ K consists of low density, highly photo-ionised gas in the 
intergalactic medium (IGM). The tight relation between tempera- 
ture and density in this regime is maintained by the competition 
between adiabatic expansion cooling and photo-ionisation heating. 
The plume of gas with p/pbar 10 - 10^ and T - 10^ - lO^K 
is comprised on the other hand of shock-heated gas in virialised 
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Figure 8. Fraction of gas mass in various phases; i.e. the different panels correspond to the mass fraction in the parts of the phase diagrams in Fig. [7] We 
do not include stellar mass in this plot. The four different gas selections add up to the total gas mass in the simulation volume. Neither code shows very 
good convergence in all of these fractions. This is true both for the SPH and the moving-mesh calculations. Nevertheless, the different mass fractions do not 
vary dramatically between the various resolutions, and there are some characteristic and systematic differences between GADGET and AREPO. GADGET 
produces more low density hot gas and AREPO shows a larger fraction of high density gas (p/pbar > 10^ K). This can already be seen from Fig.]?] by 
comparing the plume of hot particles/cells. Also, the fraction of mass in the warm to hot intergalactic medium (top right) is lower in AREPO compared to 
GADGET. We note that the redshift range in the lower left panel is reduced compared to the other panels. The inset in the upper right panel shows the fraction 
of all four main panels for the highest resolution runs together. The numbers indicate the cut they belong to. These cuts and numbers are also shown in Fig.^ 



haloes and, at the lower density end, in and around filaments. The 
narrow, downward sloping locus of gas at p/pbar > 10^ and 
T = 10^ K represents radiatively cooled, dense gas in galaxies. 
The cooling times at these densities are short, so that gas remains 
close to its equilibrium temperature where photo-ionisation heating 
balances radiative cooling. This temperature is a slowly decreasing 
function of density and lies close to 10^ K. Finally, once the gas 
becomes very dense and reaches the star formation threshold, we 
describe the gas by an effective equation of state representing the 
mean thermal energy density of a two-phase medium of hot and 
cold gas. This effective equation of state results in the upward slop- 
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ing high density line of gas, which represents the ISM in galax- 
ies. The pressurisation by supernova feedback in our subresolution 
model prevents this gas from fragmenting under self-gravity. 

As Fig.[7]demonstrates, AREPO and GADGET lead to a qualita- 
tively very similar p — T phase- space diagram. This demonstrates 
that the properties of the gas distributions are overall quite com- 
parable in both codes. However, although there is general broad 
agreement between the simulations, there are also some striking 
differences. First of all, the distribution of hot gas extends to higher 
densities in the AREPO runs compared to the GADGET simulations; 
i.e. there is more hot gas at higher densities in the AREPO Simula- 
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Figure 9. Fraction of gas volume in the same gas phases as in Fig.jS] For AREPO we naturally use the volumes of Voronoi cells to calculate these fractions, 
and for the GADGET (SPH) simulations we assigned the specific volume mspn/psPH to each SPH particle. We note that gas volume fractions are not very 
well defined in SPH, because the scheme is not volume conserving, i.e. the sum over mspH /psph does not yield in the correct total gas volume. We therefore 
take as the total volume always the sum of all specific volumes. The differences between the two hydro-schemes are qualitatively similar to those found in 
Fig. [8] especially the hot gas phase shows a very different volume fraction. We note that the redshift range in the lower left panel is reduced compared to the 
other panels. The inset in the upper right panel shows the fraction of all four main panels for the highest resolution runs together. The numbers indicate the cut 
they belong to. These cuts and numbers are also shown in Fig.^ 



tions. We note however that in terms of total mass this is not very 
significant. More important, GADGET appears to have more hot gas 
in general, which can be inferred from the slightly more extended 
yellow region in the phase diagrams. This is consistent with the 
temperature structure in the surroundings of the galaxy shown in 
Fig. [T] The AREPO runs also exhibit a more pronounced cooling 
feature around T ~ 10^^ K, which corresponds to a local mini- 
mum of the cooling curve between the two line peaks of hydrogen 
and helium. Although this feature is readily visible in the phase- 
space diagrams of the AREPO runs, the actual gas mass populating 
that feature is very small. The faint stripe-like features seen in the 
top-left and bottom-left panels of Fig. 7 below the effective equa- 



tion of state arise from numerical inaccuracies in the temperature 
evolution of some cells around star-formation sites. Here the tem- 
perature can sometimes scatter to very low values, which is one in- 
carnation of the accuracy problems associated with supersonic cold 
fluid motions. In our multi-phase model, the temperature of these 
cells is then relaxed back to the effective equation of state temper- 
ature Teff on a timescale th given by equation 12 of of | Springe!] 
|& Hernquist| ( |2003a| . The simulation timestep At of these cells is 
small against th, such that the temperature of the cell at the end 
of a relaxation step is T' ^ Tef^At/rh. Because Teff/r/^ oc p^'^ 
(see |Springel & Hernquist|2003a| ) and due to the power-of-two hi- 
erarchy of possible values for At, a set of parallel stripes of cells 
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Figure 10. Density probability distribution function of the gas in the simulation volume in units of hydrogen number density, assuming a hydrogen mass 
fraction of 0.76. The left panel shows all gas particles/cells, whereas the right panel shows only those with active star formation. In our star formation model 
the density threshold for the onset of star formation is set to nn = 0.13 cm~^, and therefore the right hand panel occupies only this high density region. 
Interestingly, GADGET shows at all resolutions a significantly reduced gas population slightly below the star formation density threshold. This gas is about 
to overcome the density threshold for star formation if it contracts a bit more via cooling. The rest of the hydrogen density distributions look rather similar 
for the two simulation schemes. The reason for the difference slightly below the threshold lies in the more extended disk in AREPO. This gas fills the "gap" 
present in the SPH calculations. 



spaced by a factor of 2 results. The mass in these cells is however 
negligibly small, so that this effect has no influence on the dynam- 
ics. 

To quantify the gas mass content in the different regions of 
the p-T phase- space diagrams in Fig. [7] we introduce the follow- 
ing cuts in the p-T plane: (1) diffuse cool gas with p/pbar < 10^ 
and T < 10^ K, (2) diffuse warm gas with p/ phav < 10^ and 
10^ K < T < 10'^ K, (3) diffuse hot gas with p/pbar < 10^ 
and T > lO'^K, and finally, (4) dense gas with p/ph^r > 10^. 
In Fig. [8] we first show the different gas mass fractions of each 
of these four phases. All runs show the behaviour we anticipated 
earlier; i.e. there is the general trend that the cold gas fraction de- 
creases and the warm/hot gas fraction increases with time. The hot 
gas fraction (lower left panel) increases due to shock heating in 
haloes. Interestingly the details vary quite strongly between the dif- 
ferent numerical schemes. The GADGET runs typically show more 
hot and warm gas, and accordingly produce less dense gas. Espe- 
cially the mass fractions in the hot phase are very different between 
the two schemes: for most of the time the mass difference here is 
larger than one order of magnitude. We will show below that these 
two findings are also consistent with the results for the star forma- 
tion histories of the different runs. We note that the same trends are 
also found for the lower resolution simulations, but the quantitative 
numbers differ slightly. Although not fully converged, there is more 
very dense gas (lower right panel) in AREPO than in GADGET at all 
resolutions, and also the amount of cold gas (upper left panel) is 
larger in AREPO. These results confirm the visual impression ob- 
tained from Fig. [T] where we also saw that the hot atmosphere is 
smaller in AREPO, but the cold gas disk is larger. 

Next, we consider in Fig. [9] the corresponding volume frac- 
tions of these gas phases. The volume fractions of the GADGET 
runs are based on the specific volume of individual SPH particles: 



VspH = mspu/ pspu - However, SPH is not volume-conserving in 
the sense that the sum of the specific volumes of all SPH particles 
will not sum up to the total simulation volume. In finite-volume 
methods like the one used by AREPO, this does not occur; all the 
cell volumes add up to the correct simulation volume. For the com- 
parisons of the volume fractions we therefore do not calculate vol- 
ume fractions with respect to the total simulation box volume, but 
with respect to the sum of all individual particle/cell volumes. This 
does not influence the volume fractions of AREPO, but changes the 
SPH volume fractions of GADGET, which would otherwise not add 
up to unity. The general trends we find for the volume fractions in 
Fig. [9] are similar to those for the mass fractions in Fig. [8] How- 
ever, we can now clearly see that the hot atmospheres are not only 
strongly reduced in mass in AREPO, but also that their volume frac- 
tion is significantly smaller compared to GADGET. This is in agree- 
ment with the qualitative results shown in Fig. [T] and Fig. [2] The 
same is true for the volume fraction of the cold gas, which is how- 
ever not as well-converged. 

To focus more on the density structure, we marginalise Fig.|7] 
over temperature and express the densities as physical hydrogen 
number densities. We hence derive the hydrogen number density 
probability distribution functions for the full simulation volume, 
which are shown in Fig.[To| The left panel accounts for all gas parti- 
cles/cells, whereas the right panel shows only those with active star 
formation. In our star formation model, the density threshold for 
the onset of star formation is set to nu = 0.13 cm~^, and therefore 
the right hand panel occupies only this high density region. Inter- 
estingly, GADGET shows at all resolutions a significantly reduced 
gas fraction slightly below the star formation density threshold. We 
checked that the reduction of the gas mass in this region is caused 
by the more extended disks in AREPO; i.e. the additional gas found 
in AREPO at these densities stems from disk material. Presumably, 
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Figure 11. Star formation rates per unit volume as a function of lookback time (top left) and redshift (bottom left). The panels on the right show the 
corresponding integrated star formation rates as a function of lookback time (top right) and redshift (bottom right). The inset in the upper left panel shows the 
relative deviations from the inferred mean SFR. Although the AREPO star formation rates do not converge as well as the SPH results obtained with GADGET, 
AREPO shows at all resolution levels a significantly higher star formation rate at late times. The high redshift star formation peak agrees in amplitude quite 
well between the different codes if we focus only on the highest resolution (L20n512). 



the 'gap' phenomenon found in SPH across strong contact discon- 
tinuities ( [Agertz et aL||2007| ), such as the one we have here, also 
contributes to the different breakdown in that density range. The 
other parts of the density distributions look rather similar for the 
two simulation methods, however. We verified that the gas slightly 
below the threshold is predominantly responsible for the more ex- 
tended disks in AREPO. This gas effectively fills the gap which is 
present in SPH. The galaxies in GADGET typically lead to a peak 
in the density PDF above the star formation threshold instead of a 
more plateau-like feature due to the extended gas disks in AREPO. 
Therefore, the difference in the density PDF is a reflection of signif- 
icantly different galaxy radii for the different hydro-schemes (see 
also Paper II). 



3.4 Cosmic star formation history 

An important prediction of our simulations is the global star for- 
mation history (SFH), which encodes key information about the 
overall efficiency of the galaxy formation process. The SFHs of our 
different simulation runs are presented in Fig.[TT] The top left panel 
shows the SFH as a function of lookback time, whereas the lower 
left panel plots it on a logarithmic scale as a function of redshift. 
The corresponding integrated star formation histories are shown in 
the right panels. The most obvious trend with increasing resolu- 
tion is that more high-^; star formation in small haloes is resolved 
( [Springel & Hernquist||2003"b| . This is simply due to the better 
mass and spatial resolution for smaller haloes, which are in part not 
even seen in the coarser simulations. This trend with resolution also 
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Figure 12. Star formation rates as a function of FoF group mass for the highest resolution simulations. Solid lines show the median values and the shaded 
regions represent the 1 — 99% range of the distribution. For both cases and for all redshifts shown here the scatter is rather small. Towards lower redshifts a 
gradual trend shows up: both schemes start to deviate strongly towards the high mass end. At 2; = there is nearly perfect overlap of the star formation rates 
below ~ 10^^ but FoF groups above this mass form significantly more stars in the AREPO simulation compared to the GADGET run. Interestingly, 

this difference at the high mass end starts to show up first at around z = ?>. This coincides with the time that the global star formation rates deviate, as can be 
seen from the lower left panel of Fig. |ll| The massive end of the halo population is hence mainly responsible for the higher star formation rate in the moving 
mesh calculations at late times. The insets show specific star formations rates, where we divided the median curve of the main panel by the stellar mass of 
the group. 



shifts the peak in the star formation rate density to higher redshift 
with increasing resolution. Note however that there is little cosmic 
time at these high redshifts, so that the mass of stars formed there is 
small compared to the present stellar density. Once the simulations 
resolve all haloes with virial temperature of 10^ K and above (ne- 
cessitating a dark matter particle resolution of about 10^ M©), we 
expect however that the trend with resolution will stop and all ordi- 
nary star formation will be resolved (apart from additional Pop-III 
star formation, which is not included in our models). 

After the high-redshift peak of star formation, the star forma- 
tion rates decline and converge particularly well at lower redshifts 



for the SPH-based GADGET simulations. In this regime, most of 
the star formation is contributed by higher mass haloes ( Springel] 
|& Hernquist|2003bt , for which our subresolution model produces 
converged results already at moderate resolution. This excellent 
convergence does not quite carry over to the moving mesh calcula- 
tions with AREPO, where all resolutions slightly vary in their late 
time star formation rates (for further explanation see Section 4.3 
and Paper III). However, the mesh-based results are still very close 
to one other, and more importantly, they lie significantly higher 
than the SPH simulations, which is clearly visible from the inset 
in the upper left panel of Fig. [TT] showing the ratio of the individ- 
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Figure 13. Distribution of star formation times (redshifts) for the highest 
resolution AREPO and GADGET simulations. The lines show the distribu- 
tion at 2; = 0, z = 2 and 2; = 3 as indicated. At higher redshift {z = 3) 
the two curves agree reasonably well in shape. But at 2; = 2 there is already 
a trend visible, because the AREPO distribution is biased towards slightly 
younger stars. This effect becomes even more pronounced towards lower 
redshift and is clearly visible in the distribution at 2; = 0. At that time the 
AREPO result is significantly different from GADGET and shows a sub- 
stantially larger fraction of young stars. The reason for this is the larger 
overall star formation rate at late times, as shown in Fig.^^ The different 
stellar age distributions are also responsible for the bluer appearance of the 
stellar populations of the AREPO galaxies shown in Fig.|3]and Fig.|5] 
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Figure 14. Summed net cooling rate (Yl — Hi)) of the gas for the 

different simulations, where Ci and Hi are the cooling and heating rates 
per unit mass. We set the net cooling to zero for particles/cells that are 
subject to net heating. The rates of all AREPO runs are higher than those 
of the GADGET simulations at late times. This is directly related to the 
higher SFRs of all AREPO runs compared to GADGET, as seen at late 
times in Fig.[TT] At high redshift, the cooling rates of AREPO are lower than 
those of GADGET for the intermediate and the lowest resolution. This is a 
consequence of the more compact disks in GADGET at these high redshifts. 
This cooling rate difference decreases once we reach the highest resolution. 
In that case the cooling rates of GADGET and AREPO agree better at high 
redshifts, but still show quite large differences at low redshifts, which also 
produces the different SFR history. 



ual SFHs to the mean. We emphasise once more that this systematic 
difference is a result only of the different hydrodynamical schemes, 
since both codes use the same star formation model and calculate 
identical star formation rates for a given gas density. There is a mi- 
nor technical difference in how stellar particles are created, but this 
does not affect the outcomes, as we have explicitly checked in nu- 
merous test calculations of isolated haloes and galaxies (see Paper 
III). The latter do show very good agreement in the star formation 
rates between GADGET and AREPO, so the difference we find in 
Fig.[TT]must have a cosmological origin. 

To understand better whether this increased star formation rate 
affects all halo masses or whether it is dominated by a certain halo 
population, we plot in Fig.[T2]the star formation rates as a function 
of FoF group mass for our highest resolution simulations. Solid 
lines in Fig. [12] represent the median values and shaded regions 
show the 1 — 99% range of the distribution. The insets show the spe- 
cific star formation rates. In both cases, and for all redshifts shown, 
the scatter around the median relation is rather small. At high red- 
shift (z — 4), the differences between the SPH and moving-mesh 
calculations mainly occur at the intermediate and low mass end, 
where GADGET produces more stars than AREPO. This is in agree- 
ment with the general trend found for the global SFH, where GAD- 
GET shows a slightly higher star formation rate at high redshift. We 
note that the higher specific star formation rates in AREPO at late 
times are due to a significantly more efficient cooling. 

We point out that one reason why GADGET shows at high-z a 
larger star formation rate (SFR) at a given resolution than AREPO is 



related to the fact that already at this early time the disks in AREPO 
are slightly more extended and hence have a lower gas surface den- 
sity (as demonstrated in Paper II). Therefore, the star formation rate 
is lower compared to the denser, more blob-like galaxies formed in 
the SPH calculations. We note that this discrepancy between GAD- 
GET and AREPO decreases once the resolution in SPH becomes 
high enough, but it still induces a noticeable delay in AREPO 's SFR- 
peak compared to GADGET, although the peak amplitudes agree 
quite well. 

Towards lower redshifts, a gradual trend of a different nature 
appears: while AREPO and GADGET start to agree better at low 
masses, the schemes deviate more and more strongly towards the 
high mass end. At z = 0, there is a nearly perfect overlap of the 
star formation rates below ^ lO^^/i^^M©, but FoF groups above 
this mass form significantly more stars in the AREPO simulations 
compared to GADGET runs. Interestingly, this difference in large 
haloes first starts to show up at around z — ?> (upper right panel), 
which roughly coincides with the time when the global star forma- 
tion rates start to deviate, as can be seen from the lower left panel 
of Fig.[TT] This implies that the massive end of the halo population 
is mainly responsible for the higher star formation rate in the mov- 
ing mesh calculations at late times. These massive systems form at 
late times and systematically create more stars in AREPO than the 
corresponding GADGET calculations, driving the global SFH to a 
different behaviour in the two codes. 

The only way such a difference in the SFH can occur is 
a more efficient cooling behaviour in AREPO in haloes above 
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Figure 15. The two panels show the mean mass-weighted net cooHng rate ((X] {Ci — Hi))/{^ rrii)), where Ci and Hi are the cooling and heating rates 
per unit mass, as a function of hydrogen number density at z = 2 (left) and z = (right). We set the net cooling to zero for particles/cells that experience net 
heating. At high redshift the cooling rates agree well between different codes, but towards lower redshifts SPH shows systematic differences compared to the 
moving mesh approach. At z = 0, AREPO shows a significantly higher mean cooling rate directly below the star formation density threshold. We note that 
the cooling rates agree well outside of the shown density range, i.e. the discrepancies in cooling occur primarily just below the threshold. 



^ lO^^/i~^M0, allowing more gas to accumulate at the bottom 
of halo potentials and above the density threshold for star forma- 
tion. This interpretation also agrees with our findings for the mass 
fractions and volume fractions in the different gas phases, where 
GADGET shows at all resolutions significantly more hot gas than 
AREPO. Furthermore it agrees with the gas density PDFs in Fig.[TO] 
which show that AREPO has a larger fraction of dense gas slightly 
below the star formation density threshold. As we will discuss in 
the next section, the culprit of the reduced cooling in GADGET lies 
in different hydrodynamical dissipation rates in the outer parts of 
haloes compared with AREPO. In Paper III of this series, we fur- 
thermore show that AREPO 's more accurate treatment of hydrody- 
namical fluid instabilities strongly affects the stripping of gas out of 
gas clumps that fall into haloes. This in turn leads to significant dif- 
ferences in how the thermodynamic structure of haloes is affected 
by the hierarchical merging process, contributing to the cooling dif- 
ferences. 

We note that the different star formation histories also result 
in a different distribution of stellar ages in individual galaxies. This 
can already be seen in Fig. [3] and Fig.|5] where not only the mor- 
phology of the stellar distribution is different, but also the average 
age as can be inferred from the colour scale. The GADGET galaxies 
look clearly redder overall than the AREPO galaxies, which are on 
average bluer. To quantify this in more detail, we show in Fig. \T3\ 
distribution functions of the formation redshifts of the stellar pop- 
ulations at z = 0, z = 2 and z = 3, for our two highest resolution 
simulations. This figure demonstrates that at high redshift (z = 3) 
the distributions agree reasonably well in shape. But at z = 2, a 
clear trend is already visible, because the AREPO distribution be- 
comes biased towards slightly younger stars. This has to do with 
the slight shift of the peak of the SFH and the different late time 
behaviour of the SFH found in Fig. [TT] The difference in the dis- 
tributions becomes even more pronounced towards lower redshift 
and is clearly evident at z = 0. At that time the AREPO distribu- 
tion is quite distinct from GADGET and shows a significantly larger 



fraction of young stars. The reason for this lies in the larger star 
formation rate at late times, as discussed above. 



4 GAS COOLING 

4.1 Cooling emission and temperature evolution 

The differences between AREPO and GADGET discussed so far 
point towards a different cooling behaviour between the two codes. 
To demonstrate this more clearly. Fig. [14] shows the total net cool- 
ing rate (Y^ mi{Ci — Hi)), where d and Hi are the cooling and 
heating rates per unit mass as a function of redshift, accounting for 
all gas in our simulations. In the case of net heating we set this rate 
to zero. The figure demonstrates that at late times, the cooling rates 
of all AREPO runs are higher than those of corresponding GAD- 
GET runs. We note that this is true for all three resolutions, yield- 
ing a clear separation between the set of blue curves (representing 
the SPH simulations) and the red curves (representing the moving 
mesh calculations). Interestingly, the difference partially reverses 
at high redshift, where the cooling rates of AREPO at the interme- 
diate and the low resolution are lower than those of GADGET. This 
is a consequence of the more compact galaxies in GADGET at these 
high redshifts, which result in larger densities and therefore more 
cooling radiation. This high-2; cooling rate difference decreases and 
nearly disappears once we reach the highest mass resolution. In that 
case the cooling rates of GADGET and AREPO agree reasonably 
well at high redshifts, as shown in Fig. [14] In contrast, the quite 
large and systematic difference at low redshifts persists indepen- 
dent of resolution. 

Further insight into the origin of the cooling difference is pro- 
vided by Fig. [15] which shows the mean mass-weighted net cool- 
ing rates ((X^m^C'i — Hi)) / {^rm)) at redshifts z — 2 (left 
panel) and 2; = (right panel) as a function of hydrogen num- 
ber density. If the temperature structure of the gas at fixed density 
would agree between all simulations, then the mean mass-weighted 
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Figure 16. Mean mass-weighted temperature of the gas in the different sim- 
ulations. In the inset we divide each curve by the mean of all curves. In 
agreement with the findings for the SFR and the cooling rates, the mean 
temperature of all GADGET simulations is higher than that of the AREPO 
runs at nearly all redshifts. 



cooling rates should also agree for a given hydrogen number den- 
sity. But as Fig. [15] shows, this is not the case and some system- 
atic differences between GADGET and AREPO are present. Inter- 
estingly, the curves in Fig. [15] differ however only over a limited 
density range corresponding to the diffuse gas in haloes, begin- 
ning at the star formation density threshold (marked by the vertical 
green line) and extending to lower densities. In that density range, 
AREPO shows a significantly larger mean cooling rate at all reso- 
lutions, which must be due to a slightly lower gas temperature in 
the mean at these densities. We note that the magnitude of the dif- 
ference is however redshift dependent and becomes smaller as we 
go to higher redshifts, as seen by comparing the 2; = and 2; = 2 
panels in Fig. [15] 

At other hydrogen densities, including the range not shown 
in Fig. [15] the SPH and moving-mesh cooling rates agree well for 
all our runs. Above the threshold for star formation this is how- 
ever not surprising, because here the effective equation of state pro- 
duces a tight correlation between density and temperature (as seen 
in Fig. [7]). This leads to a nearly one-to-one mapping of density to 
temperature in this regime, and explains the two-peak structure of 
the curves in Fig.[T5]at high densities, which simply reflects the pri- 
mordial cooling curve with its characteristic hydrogen and helium 
peaks. 

Further evidence for a different temperature structure in the 
two simulation runs is provided by Fig. [16] which plots the mean 
mass-weighted temperature in the whole simulation volume as a 
function of time. In the inset we divide each curve by the mean 
of all the curves to emphasise deviations. We find that the mean 
temperature of GADGET simulations is ^ 15% higher than that of 
the AREPO runs. The higher temperature can explain the reduction 
in cooling emission of the SPH simulations relative to our moving- 
mesh calculations, and it is consistent with our above findings for 
the SFR evolution and the cooling emission. It thus appears that 
the origin of the low redshift discrepancy must lie in a different 



efficiency of non-adiabatic heating processes in the gas, as this is 
required to explain differences in the temperature distribution at a 
given density. 

4.2 Dissipative heating in haloes 

It is widely appreciated that the intracluster medium of galaxy clus- 
ters is partially supported by subsonic turbulence ( Schuec ker et al.| 
2004 ), which is created by curved accretion shocks around the 
haloes, and more importantly, by the hierarchical infall of struc- 
tures. Indeed, a number of numerical studies have analysed turbu- 
lence in the intracluster medium fPolag et al.| |2005| |Vazza etaL] 
|2009l [Wdarnini 2011, lapichino et al. 2011). A similar level of 
turbulence can also be expected in smaller haloes that are large 
enough to support quasi-hydrostatic atmospheres. It has further 
been shown that shock heating is not only important in strong 
shocks at the outer accretion radius, but is also significant in weaker 
flow shocks that occur in large parts of the halo volume ( P frommer| 
|et al.|2006| . Physically, the dissipation associated with shocks and 
with the decay of turbulence occurs through microphysics on very 
small scales. In our numerical approach, we neglect the physical 
viscosity (which is assumed to be effectively zero on all resolved 
scales; this is why we employ the Euler and not the Navier-Stokes 
equations; see e.g. |Munoz et al.| ( |2012| ) for a Navier-Stokes im- 
plementation of AREPO) and account for the necessary dissipation 
through numerical viscosity. In SPH this is provided explicitly in 
terms of an artificial viscosity, whereas in AREPO it is introduced 
implicitly through the solutions of the Riemann solver and cell- 
averaging. 

The good conservation properties of SPH allow it to capture 
one-dimensional shock waves quite well, even though the post- 
shock velocity field typically shows substantial noise ( |Springel| 
|2010b[ [Abell[2011j . This already hints that SPH may not be par- 
ticularly accurate for subsonic flow phenomena, such as the tur- 
bulence we expect in the virialised gas of newly formed haloes. 
Indeed, |B auer & Springel| pOi 1 ) have systematically compared 
driven isothermal subsonic turbulence for GADGET and AREPO, 
using the same versions of the codes we employ here. They find 
that SPH fails quite badly to account for a turbulent cascade in the 
subsonic case, whereas a Kolmogorov scaling is obtained for the 
moving-mesh code. This happens despite identical driving fields 
in both cases, and is ultimately caused by inaccurate gradient esti- 
mates in SPH and different dissipation as a function of scale. The 
SPH simulations dissipate most of the energy already close to the 
driving scale in the subsonic case, whereas in AREPO efficient dis- 
sipation happens only on much smaller scales, so that a self- similar 
turbulent cascade can develop over some inertial range. In addition, 
"Bauer & Springel'f201 1 ) find that SPH exhibits a second maximum 
in the dissipation on very small scales of order the mean interpar- 
ticle separation. Here the subsonic noise of SPH is dissipated by 
the artificial viscosity. Interestingly, the total amount of dissipa- 
tion on these scales is quite independent of the artificial viscosity 
value itself. While a higher viscosity reduces the amplitude of the 
small-scale SPH noise, the dissipation rate on these scales still re- 
mains roughly the same, suggesting that the cause of the small- 
scale noise, which originates in errors in the pressure gradient esti- 
mates, cannot be reduced effectively with a different viscosity set- 
ting. 

There are good reasons to expect that these differences in the 
dissipation properties of the hydrodynamical schemes may also in- 
duce important effects for the thermodynamic structure of cosmo- 
logical haloes, which in turn can easily give rise to variations of 
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Figure 17. Dissipation rate profiles for haloes formed in non-radiative simulations. Each panel is for a given redshift between z = A and 2; = (as labelled), 
and stacks the results for all haloes contained in the mass range 3.0 x 10^^ Mq ^ M200 ^ 1-0 x 10^^ M©. The number of haloes that are 
averaged over is indicated in each panel. The red lines show results for the 2 x 256^ non-radiative simulation with AREPO, the blue lines the corresponding 
results for the GADGET simulation. Dashed lines give the equivalent 2 x 128^ results. 



the amount of gas that cools out of these haloes. To examine this 
further, we have carried out two auxiliary non-radiative simulations 
at resolutions of 2 x 128^ and 2 x 256"^ particles/cells. These sim- 
ulations use the same box- size and parameters as our galaxy for- 
mation runs, except that cooling and star formation were not in- 
cluded. The use of non-radiative simulations allows us to cleanly 
employ the method of 'Bauer & Springel|( |2011| ) for measuring the 
instantaneous dissipation rate of particles and cells, respectively. In 
the case of SPH, this can simply be done by measuring the work 
per unit time done by the artificial viscosity forces, which repre- 
sents the sole source of entropy generation in this method. For the 
moving-mesh code, we instead also advect the thermodynamic en- 
tropy between the cells, and then measure the rate of entropy gen- 
eration by comparing the state of a cell at the end of a timestep 
computed adopting energy conservation with the state expected if 
the entropy would have remained constant. The inferred rate of en- 
tropy production can then also be converted into a fiducial heating 
rate. 

In Fig. [17] we compare stacked profiles of the spherically 
averaged dissipative heating rate of haloes obtained in this way. 
For the plots, we have selected all haloes in the virial mass range 
3xlO^^/i~^M0 < M200 < 1 X 10^^ Mo, where M200 refers 
to the mass inside a radius i?2oo that encloses 200 times the critical 
density. We have then placed 20 logarithmic bins onto the radial 
range 0.01 x i?2oo to 4.0 x i?2oo, and produced mass- weighted 



averages of the dissipation rate per unit mass, du/dt, that we mea- 
sured for each particle/cell. In order to allow a calculation of an 
average profile by stacking the haloes, we express the dissipation 
rate in dimensionless units by multiplying it with the Hubble time 
at the given redshift, and by normalising it to V200 of the particular 
halo, where V200 = \/ GM200 / R200 is the circular velocity at the 
virial radius. In the individual panels of Fig.[T7] we show results for 
different redshifts, ranging from z = 4 to 2; = 0, and we compare 
AREPO with GADGET at the two resolutions considered here. 

We see that there is a clear systematic difference in the dissi- 
pative heating rates as a function of halo-centric distance. AREPO 
produces more heating in the infall region directly outside the virial 
radius whereas GADGET shows more heating in the outer parts of 
virialised haloes, where most of the gas mass is located. This sys- 
tematic difference is present at all redshifts in large haloes. We also 
note that the nature of the difference is robustly preserved at differ- 
ent resolutions, even though there seem to be small residual trends 
with spatial resolution. Interestingly, at fixed halo mass, the rela- 
tive level of dissipation in the innermost parts of haloes increases 
in the mesh-code with time, which may be related to the build-up 
of a turbulent entropy core in these haloes. In Fig. [18] we consider 
dissipation profiles as a function of halo mass at z = 0, which 
however does not reveal any clear trend with halo mass. 

Our interpretation of the systematic difference revealed by 
Figs. [17] and [18] is that: (1) shocks are captured more efficiently 
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Figure 18. Dissipation rate profiles for haloes formed in non-radiative sim- 
ulations. The different panels are for 2; = haloes contained in different 
mass ranges, from more massive (top) to less massive (bottom), as labelled. 
Note that the bottom panel is identical to the last panel in Fig.[n] which is 
here repeated for ease of comparison. The solid lines show results for the 
2 X 256^ non-radiative simulations, while the dashed lines give the equiv- 
alent 2 X 128^ results. 



by AREPO in the accretion regime of haloes, and (2) AREPO strips 
gas out of inf ailing objects more efficiently there; both effects con- 
tribute to a stronger dissipation in this region. In contrast, SPH stops 
infalling gas less efficiently, leading to more dissipation at smaller 
radii. A further contributor to the heating there lies in the dissipa- 
tion of the subsonic noise in SPH, which is constantly recreated 
in this region by tapping free energy from the disturbances that 
impinge on the haloes from the outside. As a net result of this dis- 
sipation of subsonic noise in the outer parts of haloes, the SPH 
simulations end up hotter overall (see also Fig.[T6|. Furthermore, 
this difference in the heating occurs in a region where one expects 
the cooling radius of many haloes in cosmological radiative simula- 
tions, thereby directly affecting the amount of cold gas produced in 
the quasi-hydrostatic cooling flows in large haloes. We note that the 
measurement of the dissipation rate is difficult due to its high time 
variability. We use above stacked profiles of small sets of halos to 
mitigate this problem, but even then the inner logarithmic bins av- 
erage over much smaller gas mass than the outer parts. This implies 
that the profiles in the inner 10% of the virial radius are relatively 
noisy. But this inner part is largely irrelevant as the impact on the 
cooling rates comes from larger radii as we discussed above. 

4.3 Mixing in haloes 

Another reason why differences in the global star formation rate 
occur between AREPO and GADGET lies in the different accu- 
racy with which hydrodynamical fluid instabilities are treated. In 
particular, we expect that for the moving-mesh code cold gas can 
be stripped more efficiently from galaxies as they fall into larger 
haloes, and this gas is then mixed with the halo's diffuse gas, which 
also lowers its temperature. 

To demonstrate this difference more explicitly in our cos- 
mological runs, we performed special test simulations of a 
10/i"^Mpc box with 128^ 
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SPH particles/cells. In Fig. 
show the sum of the total amount of star-forming gas mass and stel- 
lar mass in this box as a function of redshift. We note that an SPH 
particle/cell is star-forming if its density is higher than the threshold 
of the subresolution star formation model. The solid lines show the 
result of calculations with our standard star formation prescription, 
i.e. these simulations are equivalent to the main runs presented in 
the paper. However, for the dashed lines we turned off the creation 
of stellar particles, keeping everything else the same. In this case, 
the cold gas accumulates in the galaxies and is supported by the ef- 
fective equation of state expected for gas at these densities, except 
that the gas is not depleted and does not turn into stars at the nor- 
mal rate. For these runs, the mass in the figure therefore refers to 
the total amount of star-forming gas in these simulations (no stars 
are present). Interestingly, the solid and dashed curves overlap well 
for GADGET (blue) in Fig. [19] showing that once gas has overcome 
the density threshold it will never return to lower density. Stripping 
of gas out of galaxies does not appear to happen in any significant 
way, otherwise we would expect that the run without star formation 
should end up with a smaller amount of collapsed baryons. Instead, 
the SPH result is consistent with no stripping at all, i.e. once a gas 
particle has cooled, it never returns to lower density even though 
the galaxy may be subject to substantial shear flows upon halo in- 
fall. This behaviour has also been found by HeB & Springel (201 1, 
submitted) in a comparison of galaxy- wind interactions in SPH and 
the VPH technique (HeB & Springel 2010 ). 

For AREPO the situation is very different. As Fig. [19] shows, 
the sum of stellar mass and star-forming gas mass is larger than the 
amount of star-forming gas in the run without star particle creation. 
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Here the relevant fluid instabilities for stripping (like the Kelvin- 
Helmholtz instability) are significantly better resolved and facili- 
tate a substantial mass loss of infalling overdense blobs/subhaloes. 
Because of this, the sum of the mass of stars and of star-forming 
gas is higher in AREPO for the run with star particle creation than 
for the fiducial run where this is suppressed. In the latter simula- 
tion, there is simply more gas around that can be stripped again, 
while in the run with star particle creation, baryons that have been 
converted to stars do of course not suffer from fluid instabilities 
anymore. Apart from affecting the cooling in haloes, this difference 
in the stripping and mixing efficiency also modifies the dynamical 
friction timescale of infalling gas clumps, as we examine in more 
detail in Paper III of this series. 



5 GENERIC ISSUES WITH SPH 

Various modifications have been proposed to the conventional im- 
plementation of SPH as in GADGET to improve its reliability in cer- 
tain cases. For example, changes to the computation of smoothed 
pressure gradients (Abel 2011 ) or the addition of an artificial ther- 
mal conductivity to the equations of motion (e.g. |Price|2 008 ) enable 
SPH to better handle shearing interfaces and the onset of Kelvin- 
Helmholtz instabilities. Also, a number of studies focused on the 
development of improved artificial viscosity parameterisations in 
SPH codes (Morris 1997||Cullen & Dehnen|20T0l ), with the goal of 
reducing the numerical viscosity away from shocks. 

Given the results presented in this paper, it is tempting to ar- 
gue that one or several of these modifications of "standard SPH" 
may lead to much better agreement with the moving-mesh results or 
even resolve the discrepancies. While we cannot exclude this pos- 
sibility, we note the above refinements are ad hoc in the sense that 
they are designed to mitigate against particular problems in some 
circumstances, and may have unwanted side-effects in other situa- 
tions. While these side-effects may often be benign, the proposed 
modifications of SPH do not address other, more generic problems 
with this technique. In this section, we summarise these issues in 
order to put our findings into context with the recent literature on 
SPH techniques. 

One issue that has rarely been discussed in the cosmological 
community concerns the convergence properties of SPH. This is 
manifested in at least two ways, the first having to do with local 
smoothed estimates in SPH, and the second is related to the fact 
that SPH is not formally Lagrangian. Below, we first address the 
question of under what conditions convergence is expected in SPH 
and how this influences the relation between spatial resolution and 
the total particle number. Then, we discuss the pseudo-Lagrangian 
character of SPH and its implications for defining convergence with 
this method. Finally, we briefly suggest ways in which these issues 
might be dealt with. Incidentally, they all would have the effect of 
making SPH resemble a moving-mesh scheme like AREPO. 



5.1 Convergence in SPH and nearest neighbour number 

For a real fluid, we define the density, p(x), to be a continuous 
function of space, determined by averaging over many molecules 
locally around x. In order to formulate a discrete SPH counterpart 
to this (and this applies to all fluid variables, as well as the equations 
of motion) a two-step procedure is applied. 

First, we introduce a smoothed version of the original density 
field, (p) (x), by convolving p(x) with a smoothing kernel VK(x, /i) 




Figure 19. Test simulation of a 10/i~^ Mpc box with 128^ SPH parti- 
cles/cells. The y-axis shows the total amount of star-forming gas and stellar 
mass in the simulation volume. Solid lines show the result of a calcula- 
tion with our standard star formation prescription. For the dashed lines we 
turned off the creation of stellar particles; i.e. there are no stars formed but 
the gas still cools and is supported by the multiphase equation of state. The 
solid and dashed curves overlap well for GADGET, which shows that once 
gas overcomes the density threshold it either forms stars or stays above 
the threshold. There is no loss of star-forming gas mass. For AREPO, the 
amount of stellar mass and star-forming gas mass is larger than the amount 
of star-forming gas if stellar particle creation is turned off. This is because 
AREPO resolves fluid instabilities better and therefore allows significant 
gas mass loss from infalling satellites. Here gas that once has been above 
the threshold for star formation may well be mixed with halo gas and reach 
much lower densities again. On the other hand, once stellar particles are 
created from dense gas, the total amount of baryonic material that can be 
lost from subhaloes is reduced, which explains the difference seen in the 
two moving-mesh calculations. 

according to 

(p>(x) = yp(x')VK(x-x',/i)dV", (1) 

where the integral is over all space and h is the smoothing length. 

Second, the continuous smoothed field {p) (x) is replaced by 
a discrete quantity, (p)^(x), so that it can be represented compu- 
tationally. This is done by regarding p{yi)dV' as a mass element 
dm' and partitioning the fluid into a set of N discrete mass ele- 
ments so that the above integral can be approximated by a discrete 
sum: 

^ngb 

(p)(x) ^ (p),(x) = £ m,Ty(x-x„/i(x)), (2) 

where rrij is the mass of fluid element j, Xj is its spatial loca- 
tion, and the expression allows different fluid locations/elements 
to have different smoothing lengths. In principle, the sum extends 
over all N fluid elements. However, in order that the smoothed den- 
sity {p) (x) approaches the continuum limit represented by p(x), it 
is necessary that VK(x, /i) is spatially localised. So in practice a lo- 
calised kernel with compact support is adopted, implying that the 



© 2011 RAS, MNRAS 000,[T]j35] 



26 M. Vogelsberger et al. 



discrete sum extends not over all N fluid elements but over a subset 
A^ngb of those neighbouring a certain point in space. 

Now, we can ask about the conditions that must be satisfied 
for the smoothed, discrete version of the density field to asymptote 
to the actual continuum limit. That is, under what circumstances is 
the following true? 

(p),(x) ^ (p>(x) ^p(x). (3) 

Consider the second of these requirements first. In order for 
(p)(x) ^ p(x) we must have VK(x, /i) ^ (x) as /i ^ 0, as im- 
plied by the definition of {p) (x) in the above integral. This also 
requires that N ^ oo, otherwise there exist not enough particles 
around a given spatial location to form smoothed averages. 

Turning to the first requirement above, (p)^(x) 
will be satisfied only if at the same time we enforce the condition 
Nngh oc . This has not been emphasised in the cosmological lit- 
erature, but from the defining relation above for (p)^(x) it is clear 
that the discrete approximation will not asymptote to the smoothed 
density field unless this limit is taken. Or, to put it another way, if 
the number of neighbours is held fixed as /i ^ 0, there will be a 
constant source of error present on the most well-resolved scales 
that will not vanish as the resolution and number of particles is in- 
creased indefinitely (see also |Rasio|2000[ [Read et al.|2()T0| . (For a 
discussion of this requirement in the context of elasticity, see e.g. 
[Belytschko et al. ( 1998 ) .) 

To the best of our knowledge, the optimal way to enforce this 
requirement has not been established. However, there is little doubt 
that this must be addressed in order that SPH provides properly 
convergent solutions, as argued recently by Robin son & Monaghan| 
( [201 1 ): "When performing a convergence study using SPH, it is 
important to vary both the number of particles ... as well as the ra- 
tio of smoothing length to particle spacing [to increase the num- 
ber of neighbours]. [Otherwise] the error due to the summation 
interpolant [remains] constant [as the total particle number is in- 
creased]." We emphasise that this requirement is not fundamentally 
an issue with SPH, but simply reflects the fact that a numerical ap- 
proximation to an integral in the form of a discrete sum will not 
converge to the true answer unless the number of points where the 
integrand is sampled is made larger. 

An increase in the number of neighbours is also warranted to 
reduce the gradient errors in SPH, which give in fact rise to a siz- 
able "zero-th order error" fRead e t al.|2010| ). These gradient errors 
seriously degrade the accuracy of SPH in subsonic flow problems 
( [Springel|2010b| ) and are a primary cause for problems in the rep- 
resentation of subsonic turbulence ( [Bauer & Springel|2011| ). How- 
ever, in practice, simply increasing the number of neighbours is met 
by a serious obstacle, because it invokes the so-called clumping in- 
stability in which particles located in the inner parts of the kernel 
are pushed together by the pressure forces of the surrounding par- 
ticles, forming little clumps that frustrate attempts to reduce the 
error of the kernel sums in this way. Partially circumventing this 
problem and allowing a higher neighbour number requires differ- 
ent kernel shapes than are commonly employed ( [Read et al.|2010[ 
[Price|2012] ), but such kernels merely have different stability bands 
that still impose severe restrictions on the number of neighbours 
that may be used. In essence, due to the clumping instability, estab- 
lishing formally convergent SPH solutions is an unsolved problem 
because the path of increasing the neighbour number is blocked in 
practice. 



5.2 Implications for SPH convergence rates and efficiency 

The requirement that the number of neighbours should increase to 
reduce errors as the total number of particles is made larger has also 
serious implications for the efficiency of SPH. As an example, con- 
sider a uniform fluid in a cube of side-length L represented by N 
SPH particles. (For more general circumstances the argument gen- 
eralises if we think about small scales on which the fluid properties 
are nearly uniform.) The mean separation between the particles, ro, 
is 

If we take /i oc ro, as has been done conventionally in SPH codes 
used in cosmology, then the smoothing length will shrink in propor- 
tion to ro and, so, the number of neighbours will be nearly constant. 
In a sense, this maximises the spatial resolution, but leaves a fixed 
source of error in the discrete sums dRead e t aL][2010[ [Robinson] 
[& Monaghan|201l] ), so this path is in principle not appropriate for 
achieving proper convergence with SPH. 

If, instead, we set h oc then the fundamental relation be- 
tween N and h becomes N oc h~^^^ . What are the conditions on 
/3 so that we can meaningfully construct a procedure that converges 
numerically? Clearly, /3 < 1, otherwise h will not decrease more 
slowly than ro as N increases. Also, we must have /3 > 0, as the 
choice /3 = would mean that the spatial resolution is constant, 
independent of N. The optimal value for /3 is undetermined. One 
physically appealing choice would be /3 = 1/2, because then h 
would be the geometric mean between the mean interparticle sepa- 
ration, ro, and the size of the system, L. In that case, the relation be- 
tween N and h becomes N oc , implying a steeply rising com- 
putational cost with resolution. It is possible that a slightly larger 
value of 13 could be preferred in terms of CPU costs, but it is clear, 
however, that in order to extend the dynamic range in spatial scales 
resolved by SPH in a way that guarantees eventual convergence, 
it is absolutely necessary to increase the number of particles more 
aggressively than has been advocated previously. The dilemma, of 
course, is that the clumping instability makes this highly problem- 
atic. 

Ignoring the clumping issue for the moment, the implica- 
tions for the convergence rate of SPH are in any case important. 
When the common practice in the field is followed and the num- 
ber of neighbours is held constant, jSpringelj ( |2010b| ) reported a 
global LI error for a vortex flow (the Gresho test) that scales as 
h^'^ with resolution, as opposed to LI oc h^'^ for AREPO. We 
can then ask how the ratio of the CPU-time cost of simulations 
with the two schemes varies with the size of the error. For sim- 
ulations in 3D, the computational cost scales for both codes as 
where three powers of h come from the spatial dimensions, 
and one additional power enters due to the reduction of timesteps 
for better resolution. This then implies that the cost ratio scales 
as CPUgadget/CPUarepo oc L1~^-^^ with the error of the 
calculation. Reducing the error by a factor of 10 requires there- 
fore about a ^ 700 times higher effort in SPH than in AREPO. 
This conclusion is problem dependent and can also be affected by 
specific details of the SPH algorithm, like the kernel shape. How- 
ever, the question of computational efficiency of different numeri- 
cal schemes is best phrased in terms of such convergence rate com- 
parisons. Here the Monte Carlo character in the approximation of 
the SPH kernel sum invariably introduces a serious disadvantage 
for SPH compared with the moving mesh approach, as outlined 
above. 
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Figure 20. Schematic representation of a thin disk with gas evolving on cir- 
cular orbits. The dark blue region shows the gas that is within the smoothing 
region of one SPH particle in this disk. The upper panels show the situ- 
ation for soHd body rotation, whereas bottom panels depict the situation 
for differential rotation. The leftmost panels give the initial setup at time 
t = The middle panels show the true gas distribution of the disk a short 
amount of time later. For solid body rotation, the initial blue region retains 
its shape. This is different for the differentially rotating disk, where the blue 
gas patch experiences the shear in the disk and gets distorted. The pan- 
els on the right-hand side show the corresponding SPH representations of 
the disk at this later time. Clearly, the distortion in the differentially rotat- 
ing case is not captured correctly by SPH since the smoothing region does 
not change in shape. SPH is therefore only pseudo-Lagrangian, as opposed 
to AREPO, which implements a quasi-Lagrangian scheme that allows for 
mass exchange between cells in a manner consistent with the hydrodynamic 
equations of motion. The dashed circle in the bottom middle panel illus- 
trates a re-partitioning of the gas into SPH particles at the updated time and 
demonstrates why mixing is suppressed in SPH at the resolution scale (see 
text). 

A related conclusion, but focusing on the efficiency impact 
of the artificial viscosity parameterisation in SPH, was reached by 
[Bauer & Springel| (12011 ). If the goal is a representation of sub- 
sonic turbulence with a certain Reynolds number IZe, they showed 
that the computational cost of SPH scales at least as CPU oc VJ^,. 
In contrast, in the mesh-based treatment of AREPO, the dissipation 
scale is tied to the mesh resolution, implying a computational cost 
for reaching a certain Reynolds number that scales as oc 7^^ • Given 
that our cosmological simulations with AREPO are only about 30% 
slower than GADGET for simulations involving self-gravity, for the 
same number of resolution elements, this means that at a given 
computational cost, AREPO resolves much larger Reynolds num- 
bers than SPH. In fact, to reach the same Reynolds numbers as in 
our present moving-mesh simulations, it appears plausible that a 
factor ^ 100 - 1000 more SPH particles than AREPO cells would 
be required. 

5.3 The pseudo-Lagrangian nature of SPH 

SPH is often referred to as a Lagrangian algorithm, but this is not 
formally correct and we suggest it is perhaps better to characterise it 
as a "pseudo-Lagrangian" technique. To appreciate the reasons for 
this, consider the following example, which also highlights that this 
distinction is intimately related to issues of (suppressed) mixing in 
SPH. 

Suppose gas is revolving in a thin disk on circular orbits, and 
imagine that the gas is partitioned at some instant into regions that 
are then represented by SPH particles, with the smoothing done 



over a circular area. Each SPH particle will comprise gas from an 
area around its centre, extending outwards in radius of order the 
local smoothing length, as shown for one such SPH particle in the 
left panels of Fig. [20|at time zero. Consider now how the system 
will look like a short time later, t > 0, after the disk has rotated by 
a small angle. 

If the disk is in solid body rotation, as in the upper panels of 
Fig. [20| the fluid initially contributing to the SPH particle will oc- 
cupy an area identical to the smoothing region of the SPH particle 
at time t > 0, as indicated by comparing the "true" situation in the 
upper middle panel with the SPH version in the upper right panel. 
In this case, the SPH representation is faithful to the actual equa- 
tions of motion because the fluid contributing to the SPH particle at 
time zero is the same as that at At, and the region bounded initially 
by the SPH smoothing volume has not changed its shape. 

However, suppose instead that the disk is rotating differen- 
tially, turning around more rapidly in the inner parts than in the 
outer ones (lower panels of Fig. |20]). The "true" situation shown 
in the middle panel tells us that the gas contributing to the area 
bounded by the SPH smoothing volume at time zero will be 
stretched out owing to shear. However, in the SPH formulation, 
shown in the lower right panel, the material initially in the SPH 
smoothing volume is forced to remain tied to this area and is not 
allowed to shear, inconsistent with the equations of motion of the 
real fluid. 

A true Lagrangian picture would allow fluid elements to be 
deformed in the presence of shear, but this is inhibited by the SPH 
approach on scales smaller than those set by the smoothing pro- 
cedure. Hence, while SPH retains some characteristics of a La- 
grangian method, it does not evolve the fluid entirely in a manner 
consistent with the equations of motion and should, in this sense, 
be termed "pseudo-Lagrangian." 

Unfortunately, the error incurred by the approximations inher- 
ent to the SPH formalism is highly problem dependent and can- 
not be estimated based solely on the discretisation procedure. For 
example, there is no error made in the case of a disk in solid 
body rotation, or for a uniform disk rotating differentially, or for 
gas motions that are limited to expansion or contraction in three- 
dimensions. However, this is not true for shearing flows involving 
fluids with distinct internal properties; here the implicit "remap- 
ping" involved in enforcing a spherical shape for the SPH smooth- 
ing volume, which would change the local composition of the in- 
ternal properties, is ignored. This limitation is ultimately the reason 
why SPH does not handle mixing accurately, as we illustrate in the 
bottom, middle panel of Fig. [20| Suppose at the updated time we 
were to re-partition the gas into new SPH particles, as indicated by 
the dashed circle in this frame. The gas associated with this particle 
should include gas from the original particle, but also gas from the 
surrounding flow. This would allow mixing to occur between the 
gas in the original particle and nearby parts of the flow, but is not 
allowed to occur in SPH, by construction. 

We note that AREPO does not suffer from this defect. In this 
code, cells are not allowed to become arbitrarily distorted, in the 
interests of efficiency and accuracy, and so AREPO is also not for- 
mally Lagrangian. However, it still evolves fluids correctly when 
deviations from strict Lagrangian behaviour occur by allowing for 
mass exchange between cells, in a manner consistent with the equa- 
tions of motion. Therefore, this method should be termed "quasi- 
Lagrangian." 

A related issue arises in the context of N-body simulations of 
collisionless systems. There, a six-dimensional phase fluid is par- 
titioned into particles of fixed size that move through space in a 
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manner determined by the equations of motion. Because the phase 
fluid initially associated with a particular particle is always tied to 
that particle, the small-scale dynamics of the system will not be 
represented properly. However, in this case, unlike for the example 
discussed above, forces are strictly long-range and are less prone 
to inaccuracies in the small-scale distribution of the material than 
pressure gradients or viscosity. In that sense, the local distortions in 
the phase space fluid have fewer dynamical consequences than for a 
hydrodynamical fluid in three dimensions. Nevertheless, if an accu- 
rate description of the small-scale structure of collisionless systems 
is essential, these effects must be accounted for at some level (e.g. 
Vogelsberger et al.|2008| [Vogelsberger & White|2011b||Abel et al.| 
20TT] l 

5.4 Discussion 

We should emphasise that whether the limitations inherent to SPH 
lead to significant inaccuracies in the solution depend in detail on 
the circumstances. SPH can be expected to provide reliable results 
for flows that are kinetically dominated, or if the motions are con- 
trolled mainly by long-range forces. In these situations, errors in the 
local quantities are sub-dominant. For example, Bauer & Springel| 
([2011 ) show that while SPH does not accurately describe sub- 
sonic turbulence, in the supersonic regime, when the flow energy 
is mainly kinetic, GADGET and AREPO give similar answers. Also, 
comparisons between SPH and AMR codes have yielded compati- 
ble results for the structure of the intergalactic medium, as reflected 
in the properties of the Lyman-alpha forest (e.g. Regan et al. 2007^ . 
At these low densities, gravity dominates over internal energy and, 
moreover, the fluid is not subject to strong shear, so the above 
considerations are not critical. Even in some cases where shear is 
present, the approximations underlying SPH will not be significant, 
provided that gravitational forces are more important than local 
ones and the relevant evolution in flow properties occurs rapidly. 
Hay ward et al. (2012, in preparation) have demonstrate this explic- 
itly by simulating galaxy mergers using both GADGET and AREPO. 
They find that when the gas is treated using an effective equation of 
state, star formation rates during the course of a merger are nearly 
identical between these two codes. Thus, conclusions drawn from 
studies of galaxy collisions with SPH, such as the stellar profiles 
of merger remnants and their evolution (e.g. Hopkins et aH2008| 
2009|?|?j) or th e survivability of disks during mergers ( Springel & 
Hemquis tl2005 ), are likely robust to variations in the hydro solver. 

Unfortunately, this good level of agreement does not extend 
to the more complex flows associated with halos in cosmological 
environments, as we highlight in this paper. There, various phases 
of gas will be present in close proximity, often shearing relative to 
one another, leading to errors in simulations done with SPH like 
those we have identified here. Also, in the hydrostatic atmospheres 
of halos the hydrodynamic forces are comparable to gravity forces, 
and the subsonic turbulence present in these regions is affected by 
gradient errors in SPH. Without a formal convergence criterion, the 
consequences of these errors are difficult to assess, because even if 
a solution may plateau as the number of SPH particles is increased, 
this by no means guarantees that the correct answer is being ob- 
tained. In particular, the solution may exhibit "false convergence," 
appearing to reach a converged solution, while fixed sources of er- 
ror from e.g. summation interpolant are still present (Robinson &" 
[Monaghan|2011|). A case in point is the long-standing discrepancy 
identified for the central cluster entropy predicted by non-radiative 
SPH and mesh-based simulations, first identified in the Santa Bar- 
bara cluster comparison project ( [Frenk et al.|1999| ). 



The above considerations motivate thinking about ways to 
cure the defects inherent to SPH so that it can provide solutions of 
comparable accuracy and resolution to those obtained with a mov- 
ing mesh approach like AREPO. For example, the artificial viscosity 
typically used in SPH codes to handle shocks could, in principle, be 
eliminated by incorporating a Riemann solver into SPH. Pioneer- 
ing efforts along these lines have been made by |Inutsuka| ( |2002| | 
and [Mur ante et al . | ( |20 1 1 j ) . However, these implementations solve 
the Riemann problem for each particle-neighbour pair, and this ap- 
proach will become prohibitively expensive as long as the smooth- 
ing procedure requires averaging over an increasingly large number 
of neighbours to achieve convergence as the total particle number is 
increased. We note that alternatively a grid could be introduced into 
SPH just for the purposes of solving the Riemann problem around 
each particle. This grid would necessarily have to be adaptive, and 
hence resemble the Voronoi tessellation used in AREPO. 

In order to eliminate the pseudo-Lagrangian character of SPH 
and allow for fluid elements to become distorted on scales small 
compared to a few smoothing lengths, as should occur in e.g. shear- 
ing flows, it is necessary to allow for mass exchange between par- 
ticles in a manner consistent with the equations of motion. This 
would have the added benefit of eliminating the mixing problem in 
SPH. First suggestions in this direction have recently been made 
jWadsley et al. 2008, Price 2008, Read et al. 2010, Rea d & Hay-| 
[field,2 011 ), but the best way to formulate such mixing terms is not 
clear. One possibility to unambiguously calculate the required mass 
flux between particles would be to utilise some kind of grid. Again, 
this grid would need to be adaptive, like the unstructured mesh in 
AREPO, reinforcing the notion that our new moving-mesh approach 
quite naturally addresses several of the generic issues with SPH. 



6 CONCLUSIONS 

In this study, we have presented the first cosmological hydrody- 
namical simulations of structure formation carried out with the 
novel AREPO moving-mesh code. Compared to the widely em- 
ployed SPH technique, this method promises an important gain 
in accuracy in the numerical treatment of gas dynamics for sim- 
ilar computational costs. In order to understand the implications 
of these differences for galaxy formation simulations, we have 
evolved the same initial conditions both with AREPO and the well- 
established SPH code GADGET. Both codes share the same high- 
resolution gravity solver, and incorporate an identical radiative 
cooling and star formation model. Differences in the outcome are 
hence tied to systematic s of the hydrodynamical solvers that are 
used. It is the primary goal of our paper to identify the main dif- 
ferences and their magnitude, as well as providing evidence for the 
primary sources of potential discrepancies. 

To achieve this goal, we have considered a primary set of sim- 
ulations in boxes of 20 Mpc that include a basic treatment 
of galaxy formation physics, at various resolutions ranging from 
2 X 128^ to 2 X 512^ particles/cells. In addition, we have carried 
out a few auxiliary simulations either in smaller boxes, or of non- 
radiative type, in order to more clearly measure particular numer- 
ical effects. For reasons discussed in Section 2.3, we have chosen 
to perform the comparisons between the codes at the same initial 
mass resolution, because then the computational costs are similar. 

Our principle findings may be summarised as follows: 

• The overall distribution of gas in density and temperature is 
broadly in agreement between SPH and AREPO, but there are some 
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Figure 21. Gas disk scale lengths obtained from exponential surface density 
fits of more than 50 galaxies at z = in the halo mass range ~ 4 x 
10^^ h~'^ Mq to ~ 8 X 10^2 h~'^ Mq. Clearly, the gas disks forming in 
the AREPO simulations are typically significantly larger than those in the 
corresponding SPH runs. 



subtle and important differences. In particular, the mean mass- 
weighted temperature in AREPO is slightly smaller than in GAD- 
GET. Also, the hot gas extends to somewhat lower density in GAD- 
GET and hence occupies a larger volume fraction. 

• The cosmic star formation rate density peaks at similar values, 
but at slightly lower redshifts in AREPO. At high redshift and at 
high resolution, the SFRs between the two codes are in good agree- 
ment, but towards lower redshift there are significantly stronger 
cooling flows in AREPO, causing a correspondingly larger star for- 
mation rate. This difference occurs primarily in haloes with mass 
larger than - 10^^ /i"^ M©. 

• We find that the two codes differ significantly in the dissipative 
heating rates within haloes, with AREPO producing more dissipa- 
tion in the halo infall regions, whereas SPH produces higher dis- 
sipative heating throughout most of the outer regions of virialised 
haloes. This difference is mainly responsible for the higher tem- 
peratures found in the SPH simulations, and the correspondingly 
weaker cooling. We argue that this dissipation in SPH is likely to 
be of spurious nature, and is a combination of viscous damping of 
SPH's inherent noise and the unphysical damping of subsonic tur- 
bulence injected into haloes in the infall regions. 

• Visual comparison of simulated galaxies shows considerably 
larger stellar disks and more extended and disky stellar distributions 
in AREPO compared with GADGET. Also, the halo gas surrounding 
the galaxies looks smoother and less clumpy in AREPO. Fig. [21] 
shows gas surface density disk scale length radii obtained from ex- 
ponential surface density fits of more than 50 galaxies at z = in 
the halo mass range 4 x 10^^ M© to 8 x 10^^ M©. 
This clearly demonstrates that the gas disks forming in the AREPO 
simulations are typically significantly larger than those in the corre- 
sponding SPH runs. We further note that the gas surface density of 
disk galaxies in the AREPO simulations follow very closely expo- 
nential profiles, which is typically not the case for the SPH runs. A 
more detailed analysis is presented in |Torrey et al.| ( |20lT] ). We note 



that the large disks in the AREPO simulations result even without 
strong feedback in the form of galactic winds or modifications of 
the star formation prescription. This is also demonstrated in simpli- 
fied test simulations in Paper III. The larger extent of disk galaxies 
is also demonstrated in Paper II, where we do not assume expo- 
nential profiles, but rather use a cut in surface density to determine 
a characteristic size. We also independently calculated half-mass 
radii, which show the same trends. All these results confirm the 
findings reported in Fig. [21] i.e. that AREPO produces in general 
larger disks than GAGDET. 

The above findings clearly suggest that there are significant 
quantitative differences caused in galaxy formation simulations 
by the choice of hydrodynamical technique. It appears that the 
limited accuracy of SPH for subsonic flow phenomena such as 
Kelvin-Helmholtz instabilities or subsonic turbulence induces im- 
portant differences in the predicted properties of galaxies at low 
redshift, affecting both their morphology and stellar mass. Because 
of the accuracy problems of SPH for certain subsonic test problems 
( |Agertz et al.]2007l|Sprmgel|2010b| Bauer & Springel 201 1) it ap- 
pears that SPH cannot be expected to reach highly accurate results 
in all regimes relevant for cosmic structure formation. At the same 
time, AREPO performs significantly better on these same problems, 
and yields generally a smaller error norm and higher convergence 
rate when scrutinised against problems with known analytic solu- 
tions ( Springel 2010a|b| ). We are hence confident that the AREPO 
results presented in this paper entail a more faithful treatment of 
cosmological hydrodynamics, especially in view of the discussion 
in Section 5.3. 

It is also reassuring that the AREPO runs improve the predicted 
morphologies of simulated galaxies, as our preliminary analysis 
suggests. In Paper II of this series we will back up this finding with 
a detailed analysis of the galaxy properties. Finally, in Paper III we 
provide a more detailed analysis of the relevant numerical effects 
responsible for the differences in mixing and cooling. The body of 
these results makes it clear that AREPO simulations provide a sig- 
nificant opportunity for the development of next generation mod- 
els of galaxy formation that promise to achieve a so-far unknown 
combination of accuracy, dynamic range, and faithfulness to the 
relevant physics. 
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Figure Al. Properties of the Voronoi mesh at z = for the different 
AREPO runs. The black Hne shows the characteristics for a mesh where 
128^ vertex points are randomly distributed. Our implemented regulari- 
sation scheme tries to keep the maximum face angle small to avoid very 
distorted Voronoi cells. This can clearly be seen in the upper left panel, 
where the random distribution deviates significantly from the regularised 
simulation distributions. The regularisation also affects the offset between 
geometrical centre of the cell and the vertex location shown in the bottom 
left panel. It is important to avoid too large offsets here since they intro- 
duce inaccuracies in the linear reconstruction step of the MUSCL-Hancock 
scheme. The "roundness" of the cells is measured by rj, which is closer to 
1 (the value of a sphere) for the regularised simulations. Again, this is a 
measure of the regularity of the Voronoi mesh. For each face of a Voronoi 
cell, AREPO needs to solve a Riemann problem. Since each face is only 
solved once, the average number of Riemann problems per cell is given by 
half the average number of its faces. The distribution of Voronoi faces per 
cell is shown in the top right panel. 



APPENDIX A: VORONOI MESH STATISTICS 

Here we present some basic geometric properties of the Voronoi 
mesh in our cosmological simulations. As discussed in Section 2, 
it is beneficial for the accuracy of AREPO for the Voronoi mesh 
to remain as regular as possible throughout the simulation. To this 
end, the motion of the mesh vertices is slightly modified for highly 
distorted cells in order to obtain a mesh that has more "roundish" 
cells and is closer to a centroidal Voronoi tessellation where geo- 
metrical the centres of the cells coincide with the vertex points of 
the individual Voronoi cells. 

We can quantify the quality of a mesh in a similar way as was 
done in SIO. There, mesh quality indicators were calculated for a 
non-radiative simulation of a massive cluster (the "Santa Barbara 
Cluster", I Frenk et al.|1999| ), and it was demonstrated that the mesh 
indeed preserved its desired properties of reasonably round cells 
during the simulation. However, in this paper we have used a mod- 
ified regularisation scheme based on the maximum face-angle, and 
we include cooling, star formation, and feedback, which drastically 
increases the dynamic range the simulations need to address. It 
is therefore important to verify whether our regularisation scheme 
performs sufficiently well in these more demanding simulations. 

In Fig. |A1| we first show in the top left panel the maximum 
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Figure A2. Top panel: Distribution of cell masses divided by the target gas 
mass for each AREPO simulation at z = 0. Our re-/derefinement scheme 
guarantees that the mass of a Voronoi cell never deviates by more than ap- 
proximately a factor of 2 from the target gas mass, which is in our case 
chosen to be the mean baryon mass of a cell for a uniform distribution as- 
suming all cells have equal volume. Bottom panel: Distribution of masses of 
the stellar particles divided by the imposed target gas mass for the Voronoi 
cells. Our constraints for the mass of cells and our scheme for creating star 
particles result in a quite narrow mass distribution of stellar particles, where 
each stellar particle deviates at most a factor of ~ 2 from the desired target 
gas mass, except for a tiny number of lighter star particles, whose mass is 
however always larger than a quarter of the target gas mass. 



face angle distribution at z = for all AREPO simulations. Our 
new regularisation scheme should guarantee that these angles do 
not get too large. We have set up the regularisation such that the 
mesh is steered towards face angles smaller than 1.68. As the up- 
per left panel of Fig. [AT] demonstrates, the mesh indeed avoids very 
large face angles with a maximum in the distribution around the de- 
sired value, i.e. the regularisation works as expected. The remaining 
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three panels of Fig. |A1| show distribution functions for the number 
of faces of the Voronoi cells, for the distance d of mesh-generating 
points to the geometric centres of their cell in units of the fidu- 
cial cell radius R — (3y/47r)^/^ of each cell, and finally, for the 
T] = S^^"^ / {Q^/t^V) parameter which measures how "roundish" 
each cell is (here S is the total surface area of a cell). The corre- 
sponding distributions for a random Poisson sample of 128^ vertex 
points are shown as black lines in all panels, for comparison. Note 
that we do not explicitly control the offset d in our regularisation 
scheme, but the face angle criterion is of course correlated with 
this quantity. This allows us to obtain also quite small values for 
the distance of the geometric centre of the cell from the location 
of the mesh generating points. This is important for the numerical 
accuracy of the linear reconstruction. Overall, our Voronoi mesh 
is significantly "rounder" and much closer to a centroidal Voronoi 
tessellation than the mesh of a random point distribution. 



The goal of our re- and derefinement strategy is to keep all cell 
masses close to a predefined target cell mass, which we have chosen 
to be the total baryonic mass in the box divided by the total num- 
ber of cells at the initial time. In this way, a narrow mass spectrum 
for the formed stellar particles is obtained, and a straightforward 
and even-handed comparison to the SPH calculations done with 
GADGET becomes possible. As we have described in Section 2, we 
guarantee the approximately constant mass resolution by splitting 
cells that reach a mass larger than 2 x mtarget into two cells, while 
cells dropping in mass below 0.5 x mtarget are dissolved. We note 
that thanks to the Lagrangian mesh motion in AREPO, these re- 
/derefinement operations are invoked only rarely. In the top panel 
of Fig. |A2| we demonstrate that the re- and derefinement scheme 
successfully keeps the cell masses in the desired mass range, yield- 
ing approximately a log-normal distribution around the target mass 
within the desired bounds. 



The mass distribution of stellar particles at z = is shown in 
the bottom panel of Fig. |A2| Since our star formation implementa- 
tion does not allow star particles to be formed with a mass larger 
than 2 x mtarget, the stellar particle mass distribution is cut off at 
that value. On the other hand, the derefinement operations are con- 
strained by the requirement that neighbouring cells cannot both be 
derefined in the same time step. As a result, it is possible that some 
cells can temporarily have masses below 1/2 x mtarget for a few 
time steps, and hence also some star particles with masses smaller 
than this value may form in principle. To protect against the forma- 
tion of unreasonably low mass star particles, which may be prone 
to two-body effects, we however do not allow cells with mass less 
than 1/4 X mtarget to form any star particles, imposing a lower 
limit in the star particle mass distribution. As the bottom panel of 
Fig. |A2| shows, star particles with masses between 1/4 x mtarget 
and 1/2 X mtarget make up only a tiny fraction, as desired, such 
that the suppression of the formation of extremely small star parti- 
cles does not lead to any appreciable error. Note that star-forming 
cells with a mass above 2 x mtarget will form stars with exactly 
mtarget, keeping the rest of the gas mass in the cell. This explains 
the small spike in the distribution at that mass value. Our analysis 
thus confirms that star particles with a narrow range of masses are 
formed, which helps to limit two-body relaxation effects and is well 
matched to the fixed gravitational softening we use for collisionless 
particles. 
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Figure Bl. Star formation rates per unit volume as a function of lookback 
time (top left) and redshift (bottom left). The panels on the right show the 
corresponding integrated star formation rates as a function of lookback time 
(top right) and redshift (bottom right). We show the result of the A_L20nl28 
simulation along with a the same run with a fixed (comoving) gravitational 
softening length and a run with a different mesh regularisation scheme. All 
runs show very similar star formation rates. 




Figure B2. Gas density maps of a matched galaxy of the simulation 
A_L20nl28 and a similar simulation with a different mesh regularisation 
scheme (based on the displacement of the geometric centre of the cell). 
As expected, the detailed mesh geometry and the shape of cells change if 
we employ a different regularisation scheme. But the resulting density field 
looks similar, and especially the overall extent of the disk is not significantly 
affected by the way the mesh is regularised. We note that L20nl28 is our 
lowest resolution and therefore most sensitive to details of how exactly the 
mesh is treated. These effects are significantly smaller for our intermediate 
resolution and nearly vanish for the highest resolution simulation. 



APPENDIX B: GRAVITATIONAL SOFTENING AND 
REGULARISATION SCHEME 

The AREPO simulations presented in this paper use an adap- 
tive gravitational softening length with a lower floor for the gas, 
whereas the SPH simulations done with GADGET employ a fixed 
comoving softening length equal to the lower floor. We checked 
that this does not bias our results in any way. To demonstrate this 
point we show in Fig. |B1| the star formation history of the stan- 
dard A_L20nl28 simulation together with a run where we held the 
gravitational softening of the cells fixed at the same value as in the 
SPH calculations. Clearly this leads only to very minor differences 
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in the star fomation history. We also checked that disk half mass 
radii are not affected by this. As stated above this is due to the fact 
that star forming gas is pressurised by an effective EOS of the ISM 
and therefore gravitational softening effects do not play an impor- 
tant role in that regime. 

The mesh regularisation in AREPO can be done in different 
ways and we have used a scheme that is based on the maximum 
face angle as described above. We have also done simulations with 
the original regularisation scheme presented in Spri ngel| p010a| l, 
which is based on the displacement of the geometric centre of the 
cell. The resulting star formation history for the AX20nl28 is also 
shown in Fig. |B1| Again this leads only to minor changes. We 
note that this difference becomes smaller with resolution and es- 
sentially vanishes. Regularisation scheme differences are only rel- 
evant for very low resolution simulations. Disk sizes are also not 
significantly affected by the details of the regularisation scheme. 
In Fig. |B2| we show gas density maps of a matched object in 
A_L20nl28 and the A_L20nl28 run, which features an alternative 
regularisation scheme. We stress again that L20nl28 is our lowest 
resolution setup, and therefore most sensitive to details of how ex- 
actly the mesh is treated. But even in this regime the differences in 
the gas density maps are very small, and the size and overall extent 
of the gas disk do not change. Note that due to the low resolution 
of L20nl28, individual Voronoi cells are clearly visible in the map, 
highlighting the changes in mesh geometry and cell shapes induced 
by different regularisation schemes. 



APPENDIX C: CODE PERFORMANCE 

The total runtime of our AREPO simulations is in the worst case 
only about 30% longer than the corresponding GADGET run at the 
same nominal resolution. Fig. |C1| shows in which code parts most 
of this CPU time is spent. Obviously, the rather complex mesh con- 
struction and mesh update in AREPO takes up a significant amount 
of time. However, in both codes the Tree-PM based gravity calcu- 
lation consumes a large amount of time, too, alleviating speed dif- 
ferences in the hydrodynamical solvers. In any case, we argue that 
AREPO is actually surprisingly fast given the extremely complex 
mesh management operations that are required. Given the small 
difference in raw speed, one can rightly describe the overall perfor- 
mance of both codes as similar. However, accounting for the fact 
that we here find that SPH leads to a systematic and significant off- 
set in the cooling rates of haloes at late times, as well as in galaxy 
sizes, the discussion of CPU time requirements at the same nominal 
resolution is a bit moot. Ideally one would like to select the fastest 
method for a given desired accuracy, and as it appears, this can be 
reached with AREPO more efficiently, whereas standard SPH's sys- 
tematic bias may be difficult or even impossible to overcome. The 
white parts in Fig. |C1| account for the time spent in the time inte- 
gration, star formation, cooling, and other minor contributions. We 
note that the I/O contribution is a bit larger than typical, because 
we wrote out snapshots at a very high frequency. 



APPENDIX D: STRONG AND WEAK SCALING 



miscellaneous 



In Fig. |Dl| we show a strong scaling test of AREPO, from 32 to 512 
cores, carried out on Ranger at the Texas Advanced Computing 
Center (TACC). Here, the simulation size has been kept constant 
at 2 X 256^, and only the number of compute cores has been in- 
creased. The reported times have been averaged for three full time- 
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Figure CI. Cumulative time spent in different parts of the code as a func- 
tion of time-step for AREPO (top panel) and GADGET (bottom panel). 
The Voronoi mesh contribution of AREPO includes mesh constructions 
and mesh updates. The total number of time-steps in AREPO is somewhat 
smaller because a larger CFL factor can be used, which makes the smallest 
time-step of AREPO larger than that of GADGET. The remaining fraction 
of computing time is spent in the time integration, star formation, cooling, 
and other minor contributions. 



steps at high redshift. In this regime, gravity accounts for ^ 33% of 
the computational time, mesh-construction and mesh bookkeeping 
consumes ^ 40% of the time, while the calculation of the hydro- 
dynamical fluxes amounts to ^ 10% of the time. The remainder is 
needed for miscellaneous items such as domain decomposition, ra- 
diative cooling and star formation. In order to clearly show the scal- 
ability of the most important different parts of our code, we have 
included separate measurements in Fig. |D1| for long-range grav- 
ity (by means of FFTs), short-range gravity (done through a tree- 
walk), the necessary tree construction, the mesh construction and 
management, the hydrodynamic flux calculations, and the domain 
decomposition. We see that the code shows quite good strong scal- 
ing, despite the tightly coupled nature of the system. Some losses 
are in particular apparent for the mesh construction. These arise 
because the more cores are used, the larger the number of spatial 
domains in which the simulation volume is cut. This enlarges the 
surface area of domain boundaries, and as a result the cumulative 
"ghost" region volume in which the mesh has to be constructed 
twice on two neighbouring processes to ensure seamless consis- 
tency across the domain boundary. 

For large-scale cosmological applications it is more important 
that the code parallelisation shows good weak scaling behaviour. 
GADGET has been used for many large-scale simulations and we 
will demonstrate here that AREPO performs also well in weak scal- 
ing tests. For these tests the computational load per core is kept con- 
stant, and the simulation volume and particle/cell number is scaled 
accordingly. In Fig. |D2| we show two different weak scaling series 
carried out on Ranger at the TACC. They differ in their particle and 
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Figure Dl. Strong scaling test for the AREPO code on Ranger, using be- 
tween 32 and 512 cores. In all runs an identical simulation size of 2 x 256^ 
in a 12.5 Mpc box was used, with a 512^ FFT for the long-range grav- 
ity calculation. The black solid line shows the total wall-clock time for a full 
step as a function of the number of cores used, averaged over three steps at 
high redshift. The code was run with all physics enabled, and all code over- 
head was included in these averages. The other measurements included in 
the figure give the times for the most important individual parts of the code, 
which are the Voronoi mesh construction, the short-range and long-range 
gravity calculations, the hydrodynamical flux calculation, the domain de- 
composition and the tree construction. The black dashed line indicates the 
ideal strong scaling. 



cell load per core. In the results shown in the top panel, this load 
was 'maximal', corresponding to ~ 1 million cells and particles per 
core, where the current version of AREPO requires up to 1600 MB 
memory consumption per core in the peak. Because some memory 
is needed for the operating system and the MPI communication li- 
brary, not all of the physical memory is available for the application 
code on the Ranger platform. Unfortunately, the amount of mem- 
ory consumed by the MPI subsystem increases with increasing size 
of the MPI partition. This in fact prevents us from running the "full 
load" configuration for partitions larger than 4096 cores. This also 
prompted us to create a second weak scaling series shown in the 
bottom panel of Fig. |D2| where we reduced the load by almost a 
factor of 2. This allows us to scale AREPO up to 6912 cores with 
6912 MPI tasks. 

It is apparent that the short-range gravitational tree calcula- 
tion, the Voronoi mesh construction, and the hydrodynamical flux 
calculation show excellent weak scaling (which runs horizontally 
in the plots of Fig. |D2| ). However, there are deviations from perfect 
scalability for the FFT-based long-range gravitational force calcu- 
lation, the domain decomposition, and the tree construction. This is 
primarily because all three of these parts involve substantial all-to- 
all communication. For larger processor counts, especially for 6912 
cores, these communication costs start to affect the weak scalability 
of our code and lead to losses in efficiency. One reason for the sub- 
optimal scaling in this regime lies in the FFT part of the code. The 
slab-based parallelisation of the FFTs we use for the long-range 
gravity calculation does not scale to core numbers larger than the 
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Figure D2. The two panels show weak scaling plots for the AREPO code 
on Ranger, using from 32 up to 6912 cores. In all runs of each of the two 
series, the load per core was held constant, and the simulation box-size and 
FFT size used was scaled in proportion to the core number. For ideal weak 
scaling, the code speed in all of its parts should then be constant, despite 
the fact that the simulation size grows by a factor of more than 200 in these 
tests. We show in black the total times averaged for three full steps, and with 
different colours the most important code parts, as labelled. The top panel 
refers to a series with "fully loaded" nodes, where we use a load that is close 
to the maximum we can fit into the memory available per Ranger core. In 
the bottom panel, we have reduced the load by a factor of 2, allowing us to 
extend the tests to a core count of 6912 (for such large MPI partition sizes, 
less application memory remains available on the specific machine). Here, 
however, the communication costs in three parts of our code become quite 
substantial and lead to a noticeable negative impact of the scalability. The 
particle load of the top panel goes from 2 x 256^ to 2 x 1280*^, whereas 
for the bottom panel it goes from 2 x 214^ to 2 x 1280^ . The black dashed 
line indicates the ideal weak scaling. 



size of the FFT, a regime we enter once more than 4096 cores are 
used (because the FFT-size reaches 4096^ at this point and cannot 
be grown further due to memory constraints). We note that this can 
easily be optimised by either using a block-structured FFT decom- 
position or by using a mixed approach for the parallelisation with 
either threads or OpenMP. Overall we find that AREPO shows good 
weak scaling behaviour and can readily be applied to large-scale 
cosmological hydrodynamics simulations. 
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